greenelab / django-genes

A Django package to represent genes
BSD 3-Clause "New" or "Revised" License

bitbucket: Confusion over standard_name and systematic_name #7

Open rzelayafavila opened 6 years ago

rzelayafavila commented 6 years ago

Bitbucket issue #5 (priority: minor) https://bitbucket.org/greenelab/adage-server/issues/13/determine-how-to-link-to-ml_source_file

@dhimmel commented: """ I'm trying to understand the difference between standard_name and systematic_name. According to management/commands/genes_load_geneinfo.py

In the gene_info.gz file from Entrez, the third column is Symbol and the fourth is LocusTag. Here is the definition I found for LocusTag:

Locus tag corresponds to the systematic feature qualifier used by the international sequence collaboration (INSDC, DDBJ/EMBL/GenBank) and can be assigned by sequence submitters as a unique, systematic gene descriptor. When such a value is not available from submitted sequence, the identifier from a collaborating model organism database is used. Locus tag is often used to anchor a link to a database other than Gene. Locus tag may also be used as the preferred symbol if an official symbol has not been identified for a gene.

The Entrez Gene file Homo_sapiens.gene_info.gz doesn't have LocusTag. Both Homo_sapiens.gene_info.gz and gene_info.gz have a column Symbol_from_nomenclature_authority.

So what do standard_name and systematic_name refer to? And should these fields be renamed from name to symbol? """
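
For anyone following along, here is a minimal sketch of pulling the two columns discussed above (third = Symbol, fourth = LocusTag) out of a gene_info-style file. It is not the code in genes_load_geneinfo.py. The function name, the sample rows, and the idea that Symbol would feed standard_name and LocusTag would feed systematic_name are assumptions for illustration only. It also assumes tab-separated rows, a header line starting with `#`, and `-` as the placeholder for a missing value.

```python
import csv
import io

# Zero-based positions of the columns discussed above:
# third column = Symbol, fourth column = LocusTag.
SYMBOL_COL = 2
LOCUS_TAG_COL = 3


def read_symbol_and_locus_tag(handle):
    """Yield (symbol, locus_tag) for each data row of a gene_info-style TSV.

    Hypothetical helper, not part of django-genes. A LocusTag of '-' is
    reported as None, on the assumption that '-' marks a missing value.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank lines and the '#'-prefixed header line.
        if not row or row[0].startswith("#"):
            continue
        symbol = row[SYMBOL_COL]
        locus_tag = row[LOCUS_TAG_COL]
        yield symbol, (None if locus_tag == "-" else locus_tag)


# Made-up rows for illustration; these are not real Entrez records.
sample = io.StringIO(
    "#tax_id\tGeneID\tSymbol\tLocusTag\n"
    "1\t100\tabcA\tXX_0001\n"
    "1\t101\tabcB\t-\n"
)
pairs = list(read_symbol_and_locus_tag(sample))
```

To run this against a real download, the same function should accept a handle opened with `gzip.open("gene_info.gz", "rt")`. Rows where the second element comes back as None are the cases where no LocusTag is available.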