microbiomedata / sample-annotator

NMDC Sample Annotator
https://microbiomedata.github.io/sample-annotator/static/intro.html

allow GOLD NMDC pipeline to filter based on biosample IDs #112

Open sujaypatil96 opened 2 years ago

sujaypatil96 commented 2 years ago

Currently, the gold_nmdc_pipeline.py transformation pipeline, which retrieves data from GOLD and transforms it into NMDC-compliant JSON, reads a file listing GOLD project IDs and uses that list to subset the data it retrieves from GOLD.

Add functionality to subset based not only on GOLD project IDs, but also on GOLD biosample IDs.

Note: Make sure the file consistently contains either Gp IDs (projects) or Gb IDs (biosamples), not both. A follow-up PR could add support for a mixture of both.
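
A minimal sketch of the consistency check described in the note, assuming the ID file holds one GOLD ID per line. The function name `classify_gold_ids` and the sample IDs are hypothetical and do not come from the repository; the actual pipeline may read and validate the file differently.

```python
from typing import Iterable, List, Tuple


def classify_gold_ids(lines: Iterable[str]) -> Tuple[str, List[str]]:
    """Return ("project" | "biosample", ids) for a file of GOLD IDs.

    Hypothetical helper: assumes one ID per line, with project IDs
    starting with "Gp" and biosample IDs starting with "Gb".
    """
    # Drop blank lines and surrounding whitespace.
    ids = [line.strip() for line in lines if line.strip()]
    if not ids:
        raise ValueError("ID file is empty")

    # Collect the two-character prefix of every ID.
    prefixes = {gold_id[:2] for gold_id in ids}
    if prefixes == {"Gp"}:
        return "project", ids
    if prefixes == {"Gb"}:
        return "biosample", ids

    # Anything else is a mixture or an unrecognized prefix.
    raise ValueError(
        f"ID file must contain only Gp or only Gb IDs, found prefixes: {sorted(prefixes)}"
    )
```

With a check like this, the pipeline could decide up front whether to subset by project or by biosample, and fail early on a mixed file instead of silently ignoring some IDs.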

sujaypatil96 commented 2 years ago

@ssarrafan: this is an important issue that I will work on and close in the coming sprint.

ssarrafan commented 2 years ago

@sujaypatil96 does this mean this is part of the Deo Squad work? I will add to that sprint. @turbomam does this sound right?

ssarrafan commented 1 year ago

@sujaypatil96 Any update on this issue? Can this issue be closed today or by Tuesday?

ssarrafan commented 1 year ago

Discussed during Squad meeting 10/5 and moving to backlog.