Open ttngu207 opened 4 years ago
Proposed solution number 1:
Special feature for filepath@store (and potentially attach@store) to have meta_information attached to it.
Example of how that might look:
@schema
class NWBRaw(dj.Manual):
    definition = """
    -> Session
    ---
    nwbfile: filepath@store
    """

NWBRaw.insert1({**session_key, 'nwbfile': (nwb_filepath, {'object_id': obj_uuid})})
fp, meta = (NWBRaw & session_key).fetch1('nwbfile', fetch_meta=True)
Example dj.AttributeAdapter for an NWB object with this feature:
import pathlib

import datajoint as dj
import pynwb

class NWBObjectAdapter(dj.AttributeAdapter):
    attribute_type = 'filepath@store'

    def put(self, nwbobj):
        nwb_fp = pathlib.Path(nwbobj.container_source)
        obj_id = nwbobj.obj_id_to_store
        return nwb_fp, dict(object_id=obj_id)

    def get(self, filepath):  # returned as a tuple: (filepath, meta)
        nwb_fp, meta_dict = filepath
        # open the file at the unpacked path, not the (filepath, meta) tuple
        io = pynwb.NWBHDF5IO(pathlib.Path(nwb_fp).as_posix(), mode='r')
        nwbf = io.read()
        return nwbf.objects[meta_dict['object_id']]
The fetch_meta argument in fetch may be unnecessary. If the user inserts the filepath with metadata, it will come back with metadata as a tuple. That would be consistent and intuitive: you always fetch what you insert.
Nah, that just won't be a reasonable interface as just having one entry with meta can disrupt it. We do need to have a clean separation for when meta is returned vs not.
The separation is clean. If you insert a tuple, you fetch it back. It's simple, does not need to be explained. Users get back what they insert. If they choose to insert some records with metadata and some without, that's what they will get back too — straightforward and transparent.
It would just be much nicer to know in advance whether you are going to get tuples or a list of strings corresponding exactly to the filepaths. Providing meta should really be optional, with no chance of disrupting the main usage of getting the filepath back.
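To make the concern concrete, here is a minimal sketch in plain Python (no DataJoint involved) of what calling code might face if fetch simply returned whatever was inserted. The `fetched` list and the `split_path_and_meta` helper are hypothetical, made up for illustration; they are not part of any existing API.

```python
from pathlib import Path

# Hypothetical result of fetching 'nwbfile' from three rows when fetch
# returns exactly what was inserted: one row carries metadata, two do not.
fetched = [
    Path("/data/s1.nwb"),
    (Path("/data/s2.nwb"), {"object_id": "abc-123"}),
    Path("/data/s3.nwb"),
]


def split_path_and_meta(value):
    """Normalize a fetched value to a (path, meta) pair; meta is None if absent."""
    if isinstance(value, tuple):
        path, meta = value
        return path, meta
    return value, None


# Every consumer has to normalize before it can rely on getting plain paths.
paths = [split_path_and_meta(v)[0] for v in fetched]
metas = [split_path_and_meta(v)[1] for v in fetched]
```

With an explicit flag, this normalization step would live inside fetch rather than in every caller.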
There is a clear separation between actual data and metadata, and I find it completely consistent that we treat them separately. Let's proceed with fetch_meta-based behavior and discuss further as we see the examples.
Agreed. Yes, the option of skipping the metadata will be helpful.
Perhaps by default, fetch_meta=None, which means fetch whatever you inserted. fetch_meta=True returns tuples always. fetch_meta=False returns the paths only.
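The three settings can be sketched in plain Python as follows. This is only an illustration of the proposed semantics, not DataJoint code: `apply_fetch_meta` is a hypothetical helper, and returning `None` as the metadata for rows inserted without any is an assumption the thread has not settled.

```python
def apply_fetch_meta(stored, fetch_meta=None):
    """Shape stored filepath values according to the proposed fetch_meta flag.

    stored: values as inserted, each either a path or a (path, meta) tuple.
    """
    def as_pair(value):
        # Assumption: rows inserted without metadata get meta=None.
        return value if isinstance(value, tuple) else (value, None)

    if fetch_meta is None:
        return list(stored)                    # whatever was inserted
    if fetch_meta:
        return [as_pair(v) for v in stored]    # always (path, meta) tuples
    return [as_pair(v)[0] for v in stored]     # paths only


stored = ["a.nwb", ("b.nwb", {"object_id": "abc-123"})]
default_result = apply_fetch_meta(stored)
with_meta = apply_fetch_meta(stored, fetch_meta=True)
paths_only = apply_fetch_meta(stored, fetch_meta=False)
```

Only the default (`None`) can return a mix of shapes; `True` and `False` each guarantee a uniform result.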
Hm, potentially. Although I'd really think it's enough to offer True/False behavior defaulting to False.
Then this would introduce the inconsistency that you insert one thing and fetch another. The default behavior needs to be most consistent.
You are inserting metadata along with the data, and for that to be treated differently sounds just fine to me. It's not quite the same situation as inserting a tuple and expecting tuple back for a blob.
Is there a good reason to treat metadata differently? It's all just data. Special behaviors require extra documentation and explanations. Fetching what is inserted is consistent behavior through all other cases. If the user does not like it, they will look for the feature to skip the metadata.
Here is a more complete example using the custom data type for NWB objects.
class NWBTrace(dj.AttributeAdapter):
    """
    Custom DataJoint attribute type for NWB objects in NWB files
    """
    attribute_type = 'filepath@store'

    def put(self, nwbobj):
        nwb_path = nwbobj.container_source
        return nwb_path, nwbobj.trace_id_to_store

    def get(self, filepath):  # returned as a tuple: (filepath, meta)
        nwb_path, object_id = filepath
        # look the object up by id in the file's objects mapping
        return pynwb.NWBHDF5IO(nwb_path, mode='r').read().objects[object_id]
nwb_trace = NWBTrace()

@schema
class Ephys(dj.Manual):
    definition = """
    -> Session
    ---
    trace: <nwb_trace>
    """
...
Ephys.insert1({**session_key, 'trace': (nwb_filepath, obj_uuid)})
trace = (Ephys & session_key).fetch1('trace')
An example use-case is to work with NWB files more elegantly. For a particular NWB object, we need to store 2 things:
object_id - varchar(36)
nwb_file - filepath@store
Currently dj.AttributeAdapter does not support this, so a workaround is to use longblob and store a tuple, but this workaround implementation would not support the filepath@store type, which is crucial for working with NWB objects.