Closed nekrut closed 1 month ago
@nekrut we have the following inputs:
Organism List: https://docs.google.com/spreadsheets/d/1NRfTvebPl6zJ0l9tCqBtq6YCrwV6_XDBlheq3L5HcvQ/edit?gid=1516139693#gid=1516139693
UCSC Track List: https://hgdownload.soe.ucsc.edu/hubs/BRC/assembly.list.json
We currently take the Species and the Strain from the Organism List. What kind of harmonization needs to be done here? Each entry in the UCSC Track List includes the NCBI taxon ID (taxId), which we could use to create the links, e.g.:
{
"taxId": 1068625,
"asmId": "GCA_000227395.2_ASM22739v2",
"genBank": "GCA_000227395.2",
"refSeq": null,
"identical": false,
"sciName": "Trypanosoma congolense IL3000",
"comName": "Trypanosoma congolense (IL3000 2011)",
"ucscBrowser": "https://genome.ucsc.edu/h/GCA_000227395.2"
},
Finally, is this the type of NCBI taxon page you are referring to? https://www.ncbi.nlm.nih.gov/datasets/taxonomy/5866/
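For illustration, here is a minimal sketch of how the link could be built from a track list entry. It assumes the URL pattern of the taxonomy page linked above also applies to the taxId values in the UCSC Track List (not verified for strain-level IDs); the names NCBI_TAXONOMY_URL and taxonomy_link are hypothetical and not part of any existing code.

```python
import json

# Assumption: same URL pattern as the example taxonomy page (.../taxonomy/5866/).
NCBI_TAXONOMY_URL = "https://www.ncbi.nlm.nih.gov/datasets/taxonomy/{tax_id}/"


def taxonomy_link(entry: dict) -> str:
    """Build an NCBI taxonomy page URL from a UCSC track list entry (hypothetical helper)."""
    return NCBI_TAXONOMY_URL.format(tax_id=entry["taxId"])


# The example entry from the UCSC Track List, parsed as JSON.
entry = json.loads("""
{
  "taxId": 1068625,
  "asmId": "GCA_000227395.2_ASM22739v2",
  "genBank": "GCA_000227395.2",
  "refSeq": null,
  "identical": false,
  "sciName": "Trypanosoma congolense IL3000",
  "comName": "Trypanosoma congolense (IL3000 2011)",
  "ucscBrowser": "https://genome.ucsc.edu/h/GCA_000227395.2"
}
""")

link = taxonomy_link(entry)
print(link)
```

Whether the strain-level taxId (1068625 here) or a species-level ID such as the one in the example page should be the link target is part of the harmonization question above.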