Closed by jp-dark 1 month ago
Summary of proposed changes:
Top level changes:
- The `spatial` collection now only contains the scene. All catalog metadata gets bumped to the Experiment level.
- `obs_id` and `scene_id`.
- Decision: Should `scene_id` be an integer `soma_joinid` or the names of the Scenes stored in `spatial`?
Scene changes:
- `img` collection for images.
- `obsl` and `varl` to store location data only.
- `obssm` and `varsm` (names TBD) for additional obs/var dataframes/point clouds.
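- The `spatial` collection now only contains the scene. All catalog metadata gets bumped to the Experiment level.
@jp-dark -
1. I might have missed it, but where is the catalog metadata stored at the Experiment level? I only see `obs` and `obs_scene` dataframes in `Experiment`.
2. By _"catalog metadata"_ are you referring to metadata about _scenes_ or something else?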
- Decision: Should `scene_id` be an integer `soma_joinid` or the names of the Scenes stored in `spatial`?
@jp-dark @pablo-gar - Correct me if I am wrong, but my understanding of a `Collection` is that it is a string-keyed map. If `spatial` is such a collection and it is to be keyed by a `scene_id`, then wouldn't the `scene_id` necessarily have to be a `string`?
- The `spatial` collection now only contains the scene. All catalog metadata gets bumped to the Experiment level.
1. I might have missed it, but where is the catalog metadata stored at the Experiment level? I only see `obs` and `obs_scene` dataframes in `Experiment`.
2. By _"catalog metadata"_ are you referring to metadata about _scenes_ or something else?
"Catalog metadata" was a poor descriptor on my part. I'm talking about any dataframes/arrays that describe the scenes and/or the relationships between scenes and other pieces of the Experiment (right now just `obs_scene`, but we might want to add more).
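As a rough illustration of what such a relationship table could look like, here is a minimal plain-Python sketch (no SOMA calls). It assumes `obs_scene` holds one row per observation/scene pair with `obs_id` and `scene_id` columns; the row values, the string scene IDs, and the helper `scenes_for_obs` are hypothetical and not part of this PR.

```python
from collections import defaultdict

# Hypothetical rows of an Experiment-level `obs_scene` table:
# each row links one observation to one scene it appears in.
obs_scene = [
    {"obs_id": 0, "scene_id": "scene_a"},
    {"obs_id": 1, "scene_id": "scene_a"},
    {"obs_id": 1, "scene_id": "scene_b"},
]


def scenes_for_obs(rows):
    """Group scene IDs by observation ID, preserving row order."""
    index = defaultdict(list)
    for row in rows:
        index[row["obs_id"]].append(row["scene_id"])
    return dict(index)


obs_to_scenes = scenes_for_obs(obs_scene)
```

Because the table lives at the Experiment level in this sketch, the scenes themselves would not need to carry any of this bookkeeping.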
- Decision: Should `scene_id` be an integer `soma_joinid` or the names of the Scenes stored in `spatial`?
@jp-dark @pablo-gar - Correct me if I am wrong, but my understanding of a `Collection` is that it is a string-keyed map. If `spatial` is such a collection and it is to be keyed by a `scene_id`, then wouldn't the `scene_id` necessarily have to be a `string`?
As is, yes**. If we key on a `soma_joinid`, we would need to add a mapping from the scene name to the join ID.
** Groups in TileDB are actually primarily keyed by index with an optional string name, but I believe the SOMA implementation treats collections as string keyed maps.
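To make the trade-off concrete, here is a minimal plain-Python sketch (no SOMA or TileDB calls) of the extra lookup that an integer `scene_id` would require. The dict standing in for the `spatial` collection, the scene names, and the `scene_name_by_joinid` mapping are all hypothetical.

```python
# Plain-Python stand-in for a string-keyed `spatial` collection.
# The scene names and contents are made up for illustration.
spatial = {
    "tissue_section_1": {"kind": "scene"},
    "tissue_section_2": {"kind": "scene"},
}

# Extra mapping that would be needed if `scene_id` were an integer
# soma_joinid: join ID -> scene name (hypothetical table).
scene_name_by_joinid = {0: "tissue_section_1", 1: "tissue_section_2"}


def scene_by_joinid(joinid: int) -> dict:
    """Resolve an integer scene_id to a scene via the name mapping."""
    name = scene_name_by_joinid[joinid]
    return spatial[name]


def scene_by_name(name: str) -> dict:
    """With string scene IDs, the collection is indexed directly."""
    return spatial[name]
```

With string IDs the collection key and the `scene_id` are the same value. With integer IDs, every access goes through one more table that has to be kept in sync with the collection.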
@jp-dark (cc: @pablo-gar) - For `obsl`, `varl`, `obssm`, and `varsm`, some more detailed comments about their meaning would be helpful. If possible, an illustrative example would not be bad either. For instance, `obsl` contains columns: `obs_soma_joinid: int`, `geometry: SomaGeometry`, etc.
Especially since we will inevitably be changing things, these comments will be helpful for the future.
@jp-dark - After you add the comments, let me know and I will approve this PR
This PR provides a proposed restructuring to the spatial components.