laminlabs / lamindb-setup

Setup & configure LaminDB.
Apache License 2.0

🏗️ Update instance schema in the hub #774

Closed fredericenard closed 1 month ago

fredericenard commented 1 month ago

Here is the relevant PR to create the table in the hub:

github-actions[bot] commented 1 month ago

🚀 Deployed on https://665f2d83461e2000951dae6a--lamindb-setup-htry.netlify.app

codecov[bot] commented 1 month ago

Codecov Report

Attention: Patch coverage is 96.77419% with 8 lines in your changes missing coverage. Please review.

Project coverage is 82.40%. Comparing base (d0b57ad) to head (4862687). Report is 2 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| lamindb_setup/_schema_metadata.py | 97.15% | 7 Missing :warning: |
| lamindb_setup/_migrate.py | 50.00% | 1 Missing :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #774      +/-   ##
==========================================
+ Coverage   81.22%   82.40%   +1.18%
==========================================
  Files          40       41       +1
  Lines        2866     3104     +238
==========================================
+ Hits         2328     2558     +230
- Misses        538      546       +8
```


fredericenard commented 1 month ago

@falexwolf, upon instance migration we need to call schema_metadata.to_json() and store the result in the hub.

falexwolf commented 1 month ago

Great!

I have some comments re storing this in the hub repo.

falexwolf commented 1 month ago

@fredericenard, I'm getting this error now:

____________________________________________________________________________________________________________ test_synchronize_new_schema ____________________________________________________________________________________________________________

setup_instance = None

    def test_synchronize_new_schema(setup_instance):
>       is_new, schema = synchronize_schema()

tests/hub-local/test_synchronize_schema.py:14: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
lamindb_setup/_schema_metadata.py:33: in synchronize_schema
    return call_with_fallback_auth(_synchronize_schema)
lamindb_setup/core/_hub_client.py:135: in call_with_fallback_auth
    raise e
lamindb_setup/core/_hub_client.py:128: in call_with_fallback_auth
    result = callable(**kwargs, client=client)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

client = <supabase._sync.client.SyncClient object at 0x127f19420>

    def _synchronize_schema(client: Client):
        schema_metadata = SchemaMetadata()
        schema_metadata_dict = schema_metadata.to_json()

        schema_encoded = json.dumps(schema_metadata_dict, sort_keys=True).encode("utf-8")
        schema_hash = hashlib.sha256(schema_encoded).digest()
        schema_uuid = UUID(bytes=schema_hash[:16])

        schema = _get_schema_by_id(schema_uuid, client)

        is_new = schema is None
        if is_new:
            module_set_info = schema_metadata._get_module_set_info()
            module_ids = "-".join(str(module_info["id"]) for module_info in module_set_info)
            schema = (
                client.table("schema")
                .insert(
                    {
                        "id": schema_uuid.hex,
                        "module_ids": module_ids,
                        "module_set_info": module_set_info,
                        "json": schema_metadata_dict,
                    }
                )
                .execute()
                .data[0]
            )

        instance_response = (
            client.table("instance")
            .update({"schema_id": schema_uuid.hex})
            .eq("id", settings.instance._id.hex)
            .execute()
        )
        assert (
>           len(instance_response.data) == 1
        ), f"Instance {settings.instance._id.hex} was not properly linked to schema {schema_uuid.hex}"
E       AssertionError: Instance 71a2057efcb55b01823887e5558926db was not properly linked to schema 52dcd0f43a173aec3972705497ad8315

lamindb_setup/_schema_metadata.py:71: AssertionError
-------------------------------------------------------
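
For reference, the schema ID in `_synchronize_schema` above is content-derived: the JSON is serialized with sorted keys, hashed with SHA-256, and the first 16 bytes of the digest become a UUID. A minimal standalone sketch of just that step follows; the helper name `schema_uuid_from_dict` and the example dicts are made up for illustration and are not the real output of `schema_metadata.to_json()`.

```python
import hashlib
import json
from uuid import UUID


def schema_uuid_from_dict(schema_metadata_dict: dict) -> UUID:
    """Derive a deterministic UUID from a JSON-serializable dict (hypothetical helper)."""
    # sort_keys=True makes the serialization independent of dict key order
    encoded = json.dumps(schema_metadata_dict, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(encoded).digest()
    # a UUID holds 16 bytes, so only the first half of the 32-byte digest is used
    return UUID(bytes=digest[:16])


# made-up stand-ins for schema metadata; not real lamindb output
a = {"core": {"Artifact": ["id", "key"]}, "bionty": {"Gene": ["id"]}}
b = {"bionty": {"Gene": ["id"]}, "core": {"Artifact": ["id", "key"]}}  # same content, other key order
c = {"core": {"Artifact": ["id", "key", "hash"]}, "bionty": {"Gene": ["id"]}}  # one extra field

same_id = schema_uuid_from_dict(a) == schema_uuid_from_dict(b)
changed_id = schema_uuid_from_dict(a) != schema_uuid_from_dict(c)
```

Two instances with identical schema content therefore map to the same row in the `schema` table, while any change to the content yields a new ID.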
falexwolf commented 1 month ago

In the migration dialogue, I'm raising this error if the user is incorrect: https://github.com/laminlabs/lamindb-setup/blob/99d29ea05e162285f0755f2f68980d744e5b268b/lamindb_setup/_migrate.py#L88-L97

fredericenard commented 1 month ago

Because it's using local storage, we need to register the instance to ensure it's in the hub.
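
To make the failure mode concrete, here is a rough sketch with an in-memory dict standing in for the hub's `instance` table. The function `link_instance_to_schema` is hypothetical and does not call supabase; it only mimics the idea that an update filtered by `id` returns the rows it changed, so an unregistered instance yields an empty result and the `len(...) == 1` assertion fails.

```python
def link_instance_to_schema(instance_table: dict, instance_id: str, schema_id: str) -> list:
    """Mimic an update filtered by id: return the rows that were changed (hypothetical)."""
    row = instance_table.get(instance_id)
    if row is None:
        # no row matches the filter, so nothing is updated
        return []
    row["schema_id"] = schema_id
    return [row]


# IDs copied from the traceback in this thread
instance_id = "71a2057efcb55b01823887e5558926db"
schema_id = "52dcd0f43a173aec3972705497ad8315"

hub_instances: dict = {}  # the instance was never registered in the hub

rows_before = link_instance_to_schema(hub_instances, instance_id, schema_id)

# registering the instance first creates the row the update needs
hub_instances[instance_id] = {"id": instance_id, "schema_id": None}
rows_after = link_instance_to_schema(hub_instances, instance_id, schema_id)
```

Here `rows_before` is empty, which corresponds to the failing assertion, and `rows_after` contains exactly one row once the instance is registered.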

fredericenard commented 1 month ago

@falexwolf it's ok to merge on my end

falexwolf commented 1 month ago

Ok, I'll add it to the migrations deploy command first!