galaxyproject / brc-analytics

MIT License

Link species name to NCBI Taxon page #82

Closed: nekrut closed this issue 1 month ago

nekrut commented 2 months ago
  1. Use the taxId field of the UCSC JSON to create links like https://www.ncbi.nlm.nih.gov/datasets/taxonomy/5866
  2. Make the species name link to the NCBI taxonomy page.

NoopDog commented 2 months ago

@nekrut we have the following inputs:

Organism List https://docs.google.com/spreadsheets/d/1NRfTvebPl6zJ0l9tCqBtq6YCrwV6_XDBlheq3L5HcvQ/edit?gid=1516139693#gid=1516139693

UCSC Track List https://hgdownload.soe.ucsc.edu/hubs/BRC/assembly.list.json

We currently take the Species and the Strain from the Organism List. What kind of harmonization needs to be done between the two sources? The UCSC Track List includes the NCBI taxon ID (taxId), which we could use to create the links, e.g.:

    {
      "taxId": 1068625,
      "asmId": "GCA_000227395.2_ASM22739v2",
      "genBank": "GCA_000227395.2",
      "refSeq": null,
      "identical": false,
      "sciName": "Trypanosoma congolense IL3000",
      "comName": "Trypanosoma congolense (IL3000 2011)",
      "ucscBrowser": "https://genome.ucsc.edu/h/GCA_000227395.2"
    },

Finally, is this the type of NCBI taxon page you are referring to? https://www.ncbi.nlm.nih.gov/datasets/taxonomy/5866/
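
For illustration, the link construction discussed here could look roughly like the sketch below. It is not the project's implementation: the helper name taxonomy_url and the constant NCBI_TAXONOMY_BASE are hypothetical, the URL pattern is taken from the example link in the issue description, and the record is the single entry quoted above (the overall layout of assembly.list.json is not assumed).

```python
import json

# Assumption: URL pattern copied from the example link in the issue
# (https://www.ncbi.nlm.nih.gov/datasets/taxonomy/5866).
NCBI_TAXONOMY_BASE = "https://www.ncbi.nlm.nih.gov/datasets/taxonomy"


def taxonomy_url(tax_id: int) -> str:
    """Build an NCBI taxonomy page URL from a numeric taxon ID (hypothetical helper)."""
    if not isinstance(tax_id, int) or tax_id <= 0:
        raise ValueError(f"expected a positive integer taxon ID, got {tax_id!r}")
    return f"{NCBI_TAXONOMY_BASE}/{tax_id}"


# The single record quoted in the comment above.
record = json.loads("""
{
  "taxId": 1068625,
  "asmId": "GCA_000227395.2_ASM22739v2",
  "genBank": "GCA_000227395.2",
  "refSeq": null,
  "identical": false,
  "sciName": "Trypanosoma congolense IL3000",
  "comName": "Trypanosoma congolense (IL3000 2011)",
  "ucscBrowser": "https://genome.ucsc.edu/h/GCA_000227395.2"
}
""")

# Pair the displayed species name with its taxonomy link.
link = taxonomy_url(record["taxId"])
print(f'{record["sciName"]} -> {link}')
```

Note that the taxId in this record (1068625) belongs to the strain-level entry, so whether the species name should link to that ID or to a species-level ID is part of the harmonization question raised above.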