Closed. AlanSimmons closed this issue 7 months ago.
This will not be part of the generation framework script driven by build_csv.py.
The RefSeq data will be added as a set of Definition nodes joined to HGNC IDs. This is a straightforward addition to the DEF.csv and DEFrel.csv files. The bulk of the work will be connecting to the NCBI eUtils API to obtain information on a large number of genes.
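Because the work involves requesting information for a large number of genes, one plausible approach is to send gene IDs to eUtils in batches rather than one request per gene. The following is a minimal sketch only; the helper name `batch_ids` and the default batch size of 200 are assumptions for illustration, not values taken from the actual script or from NCBI guidance.

```python
from typing import Iterable, Iterator, List


def batch_ids(gene_ids: Iterable[str], batch_size: int = 200) -> Iterator[List[str]]:
    """Yield successive lists of at most batch_size gene IDs.

    batch_size is a hypothetical default; tune it to whatever the
    API and rate limits actually allow.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    for gene_id in gene_ids:
        batch.append(gene_id)
        if len(batch) == batch_size:
            yield batch
            batch = []
    # Emit the final, possibly shorter, batch.
    if batch:
        yield batch
```

Each yielded batch could then be joined with commas and passed as the `id` parameter of a single eUtils request.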
It is now possible to ingest RefSeq summary information for genes.
Script code complete. Dev UBKG instance ready for upload to Globus. There is currently a problem with Globus.
Globus problem resolved.
Dependency on https://github.com/x-atlas-consortia/hs-ontology-api/issues/14
Request
Add to the UBKG the summary for each gene in HGNC shown in the mockup below.
Summary information is maintained by RefSeq.
RefSeq information is available via FTP download of the source data as described here. NCBI's eUtils REST API also provides summary information. Sample link: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=gene&id=604,1&retmode=json
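As a rough illustration of how the sample link could be used, the sketch below builds the same esummary URL for a list of gene IDs and pulls the summary text out of a JSON response. The function names are hypothetical, and the assumed response shape (a `result` object holding a `uids` list plus one entry per ID with a `summary` field) should be checked against a live response before relying on it. The sample payload is hand-written for illustration and is not real RefSeq data.

```python
from typing import Dict, List
from urllib.parse import urlencode

ESUMMARY_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"


def build_esummary_url(gene_ids: List[str]) -> str:
    """Build an esummary request URL for NCBI Gene IDs (hypothetical helper)."""
    params = {"db": "gene", "id": ",".join(gene_ids), "retmode": "json"}
    # safe="," keeps the comma-separated ID list readable, as in the sample link.
    return f"{ESUMMARY_URL}?{urlencode(params, safe=',')}"


def extract_summaries(payload: dict) -> Dict[str, str]:
    """Map each gene ID to its summary text.

    Assumes the payload looks like {"result": {"uids": [...], "<id>": {...}}}.
    """
    result = payload.get("result", {})
    return {
        uid: result[uid].get("summary", "")
        for uid in result.get("uids", [])
        if uid in result
    }


# Hand-written stand-in for a response; the summary strings are placeholders.
sample_payload = {
    "result": {
        "uids": ["604", "1"],
        "604": {"uid": "604", "summary": "placeholder summary for gene 604"},
        "1": {"uid": "1", "summary": "placeholder summary for gene 1"},
    }
}

url = build_esummary_url(["604", "1"])
summaries = extract_summaries(sample_payload)
```

The returned dictionary is keyed by NCBI Gene ID, so mapping each summary onto an HGNC ID would still require a separate ID crosswalk, which is not shown here.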