Closed keller-mark closed 1 month ago
Right, we'll need to rethink save_vitessce_config()
for artifacts with manually generated storage paths. It was out-of-scope for the initial version of this function and that's why the validation error is thrown.
Can you give a PR a try? It'd be very meaningful for the public use cases of the Vitessce integration, both cellxgene and the Vitessce example datasets aren't managed via lamindb and hence violated the f".lamindb/{UID}{SUFFIX}"
pattern.
I suggest to pass a mapping between artifact uids & storage paths to save_vitessce_config()
or within the VitessceConfig
option. But the UX here should be mainly dictated by what you'd see as most convenient, @keller-mark. It comes down to identifying whatever is a storage url with an artifact (and doing this for potentially multiple artifacts).
Transferred artifacts do not necessarily include a lamindb uid within the trailing filename of their S3 path.
For example:
However the current
save_vitessce_config
assumes that all artifacts include a lamindb uid in their S3 path, so the error on this line is thrown for transferred artifacts from cellxgene: https://github.com/laminlabs/lamindb/blob/ca4ce4a9f54cded358a01e73be8494e28dc8a796/lamindb/integrations/_vitessce.py#L72The current assumption is that S3 paths are structured like