Open sujaypatil96 opened 2 years ago
@ssarrafan: this is an important issue that I will work on and close in the coming sprint.
@sujaypatil96 does this mean this is part of the Deo Squad work? I will add to that sprint. @turbomam does this sound right?
@sujaypatil96 Any update on this issue? Can this issue be closed today or by Tuesday?
Discussed during Squad meeting 10/5 and moving to backlog.
Currently, the gold_nmdc_pipeline.py transformation pipeline, which retrieves data from GOLD and transforms it into NMDC-compliant JSON, reads a file in which you can list GOLD project IDs. The pipeline uses that list to subset the data it retrieves from GOLD.
Add functionality to subset based not only on GOLD project IDs, but also on GOLD biosample IDs.
Note: Make sure the file contains either only project IDs (Gp's) or only biosample IDs (Gb's), not both. A follow-up PR could add support for a mixture of the two.
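
A minimal sketch of the consistency check described in the note, not the pipeline's actual code. It assumes the ID file holds one bare ID per line, with project IDs starting with `Gp` and biosample IDs starting with `Gb`. The function name `classify_gold_ids` and the returned labels `"project"` and `"biosample"` are hypothetical, chosen only for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical mapping from ID prefix to the kind of GOLD entity it names.
PREFIX_TO_KIND = {"Gp": "project", "Gb": "biosample"}


def classify_gold_ids(lines: Iterable[str]) -> Tuple[str, List[str]]:
    """Return (kind, ids) for a list of GOLD IDs, rejecting mixed files.

    Blank lines and surrounding whitespace are ignored. A ValueError is
    raised if the input is empty, contains an unrecognized prefix, or
    mixes Gp and Gb IDs.
    """
    ids = [line.strip() for line in lines if line.strip()]
    if not ids:
        raise ValueError("no GOLD IDs found")

    prefixes = {gold_id[:2] for gold_id in ids}
    unknown = prefixes - PREFIX_TO_KIND.keys()
    if unknown:
        raise ValueError(f"unrecognized ID prefix(es): {sorted(unknown)}")
    if len(prefixes) > 1:
        raise ValueError("file mixes Gp and Gb IDs; use one kind per file")

    # Exactly one known prefix remains, so it determines the subset mode.
    kind = PREFIX_TO_KIND[prefixes.pop()]
    return kind, ids
```

The pipeline could call this once after reading the file and then branch on the returned kind to decide whether to query GOLD by project or by biosample. Raising on a mixed file keeps the first PR simple and leaves room for mixed support later.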