AllenNeuralDynamics / aind-data-transfer-service

FastAPI service to run data compression and transfer jobs on the hpc
MIT License

Upload independent assets through transfer service #173

Closed rcpeene closed 2 weeks ago

rcpeene commented 1 month ago

Describe the solution you'd like: We'd like to upload information that is associated with certain mice to Code Ocean but isn't a session in the way the transfer service expects. Specifically, we have the anatomy information with all the CCF coordinates. This is usually produced weeks after the ecephys and behavior data are ready for upload. So we'd like to be able to generate a new asset that contains just this information for this mouse, separate from the ecephys session asset that would have been uploaded much earlier.

Describe alternatives you've considered: If it's easy to just use the Code Ocean API to do this on my own, then perhaps this isn't needed. It's not clear to me what considerations determine whether this is a good use case for the transfer service. It would be very nice to have this triggerable the same way, though.

jtyoung84 commented 1 month ago

@saskiad @dbirman Saskia mentioned that we should do this by modifying the metadata, which can be done without creating a new data asset.

rcpeene commented 4 weeks ago

Also, it seems like CCF will soon be done entirely on code ocean, so we won't have this need much longer. Might be worth considering for other session-related data though.

How would the metadata be modified to contain other non-metadata?

saskiad commented 4 weeks ago

I want to make sure I understand: the CCF information you are adding is CCF coordinates for the recorded units in a session? Or is it something else? This needs to live within the same asset as the ecephys. I don't think it should be a separate asset, distinct from the ecephys data it's part of. That would be a new derived asset. raw ecephys -> sorted ecephys -> sorted/aligned ecephys

Or is this something else altogether?

rcpeene commented 3 weeks ago

You're correct that this is the CCF coordinates for recorded units. The issue is that the OPT anatomy processing often isn't completed until several weeks after the ecephys is uploaded and processed. So we've been uploading a 'CCF' asset for the session separately. We could also reupload the session data together with the CCF files, I suppose, but that would automatically rerun the spike sorting and eye tracking, which might be redundant/wasteful (and create an asset that is mostly redundant).

dbirman commented 3 weeks ago

@rcpeene this isn't a situation where the transfer service should be used for the reasons you mentioned. We should either create a derived asset in CO or push this to the metadata.

But I'm confused here, because I didn't think units were stored in the metadata at all. To quickly summarize the situation: after the histology for a session is processed, an alignment is run, and the histology-aligned CCF coordinates then become available. We want to put this information with the sorting results.

@saskiad can you confirm which behavior is correct here? Should (1) a new derived asset be created with the histology-aligned coordinates added as a file, or (2) the coordinates be uploaded via the aind-data-access-api, with a corresponding change in the metadata?

saskiad commented 3 weeks ago

You are correct that units are not stored in the metadata. This should result in a new derived asset that has the aligned coordinates. It doesn't require re-running the spike sorting or eye tracking; it should only require adding the coordinates to the units table in the NWB file.
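
For illustration only, here is a minimal sketch of the alignment step that adding coordinates to a units table implies: turning per-unit CCF coordinates into columns ordered row-for-row like the existing units. It uses plain Python rather than any NWB library, and every name in it is an assumption that does not come from this thread: the function `build_ccf_columns`, the column names `ccf_ap`, `ccf_dv`, `ccf_ml`, and the AP/DV/ML axis order. Units without an aligned coordinate get NaN so the columns stay the same length as the table. Writing the columns into the actual NWB file is left out.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical representation: one (AP, DV, ML) coordinate per unit.
Coordinate = Tuple[float, float, float]

NAN = float("nan")


def build_ccf_columns(
    unit_ids: Sequence[int],
    ccf_by_unit: Dict[int, Coordinate],
) -> Dict[str, List[float]]:
    """Return one column per CCF axis, aligned row-for-row with unit_ids.

    Units missing from ccf_by_unit are filled with NaN so that every
    column has exactly len(unit_ids) entries.
    """
    columns: Dict[str, List[float]] = {"ccf_ap": [], "ccf_dv": [], "ccf_ml": []}
    for unit_id in unit_ids:
        ap, dv, ml = ccf_by_unit.get(unit_id, (NAN, NAN, NAN))
        columns["ccf_ap"].append(ap)
        columns["ccf_dv"].append(dv)
        columns["ccf_ml"].append(ml)
    return columns


if __name__ == "__main__":
    # Made-up example values: unit 1 has no aligned coordinate yet.
    existing_unit_ids = [0, 1, 2]
    aligned = {0: (1.0, 2.0, 3.0), 2: (4.0, 5.0, 6.0)}
    for name, values in build_ccf_columns(existing_unit_ids, aligned).items():
        print(name, values)
```

Keeping the columns keyed by the existing unit order means the sorted results are only read, not recomputed, which matches the point that spike sorting and eye tracking do not need to be rerun.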

dbirman commented 3 weeks ago

@rcpeene Does that all make sense? I'll close this issue with your confirmation.

rcpeene commented 2 weeks ago

Makes sense to me