As part of the efforts in the NCBI Export squad, one of the requirements that has come up is the need to be able to retrieve DataObjects (ids and URLs) given a Biosample id.
Ideally, this would be a specific case of NMDC Database roll-up, but since we don't have the "machinery" for that just yet, we will need to implement something custom for this use case in the meantime.
The code for the NCBI Export squad is being developed in PR #518.
The two cases that we need to cover are:
The given Biosample id may be a direct input (through the has_input key) of an OmicsProcessing record whose output (through the has_output key) is one or two DataObject ids, in which case we need to retrieve the DataObject records for those ids, or
The given Biosample id may be an input to a lab processing class (Pooling, Extraction, LibraryPreparation) whose output (through the has_output key) is a ProcessedSample, and that ProcessedSample is in turn an input (through the has_input key) to an OmicsProcessing record, whose DataObject outputs we then retrieve as in the first case.
Implementation details:
We can develop this either as an API endpoint or just as an @op and use it in code. Which would be better?
We can use the get_mongo_db() method or the mongo resource. Which would be better?
I'm also thinking that the method I implement will iterate over all the records in the alldocs collection.
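To make the two cases concrete, here is a rough sketch of the lookup over an in-memory iterable of documents, so it does not depend on whether we end up with an endpoint or an @op, or on how the database handle is obtained. The function name data_objects_for_biosample is hypothetical, and the field names id and type (and the type values used) are assumptions about how alldocs records look; only has_input and has_output come from the description above. It makes one pass over the records to index them, then follows has_input/has_output links from the Biosample.

```python
from collections import defaultdict, deque

# Assumed type labels; real alldocs records may mark their class differently.
OMICS_TYPE = "OmicsProcessing"
LAB_TYPES = {"Pooling", "Extraction", "LibraryPreparation"}


def data_objects_for_biosample(biosample_id, docs):
    """Return the DataObject documents reachable from biosample_id.

    `docs` is any iterable of dicts shaped like alldocs records
    (assumed keys: "id", "type", optional "has_input"/"has_output").
    """
    docs_by_id = {}
    consumers = defaultdict(list)  # input id -> processes listing it in has_input
    for doc in docs:  # single pass over all records
        docs_by_id[doc["id"]] = doc
        for input_id in doc.get("has_input", []):
            consumers[input_id].append(doc)

    found = {}  # DataObject id -> document, kept in discovery order
    seen = {biosample_id}
    queue = deque([biosample_id])
    while queue:
        sample_id = queue.popleft()
        for process in consumers.get(sample_id, []):
            outputs = process.get("has_output", [])
            if process.get("type") == OMICS_TYPE:
                # Case 1: outputs of OmicsProcessing are DataObject ids.
                for out_id in outputs:
                    if out_id in docs_by_id:
                        found[out_id] = docs_by_id[out_id]
            elif process.get("type") in LAB_TYPES:
                # Case 2: outputs are ProcessedSample ids; keep following them.
                for out_id in outputs:
                    if out_id not in seen:
                        seen.add(out_id)
                        queue.append(out_id)
    return list(found.values())
```

Because the traversal keeps following ProcessedSample outputs, a chain of several lab processing steps (for example Extraction followed by LibraryPreparation) is handled the same way as a single step. Against the real database, `docs` could be the cursor returned by a find() on the alldocs collection, although indexed queries on has_input would likely scale better than a full scan.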