dictyBase / Migration

Entrypoint for dictybase overhaul project

Separate Protein Name from Protein Synonyms #88

Open pfey03 opened 7 years ago

pfey03 commented 7 years ago

Currently, all protein names are stored as 'Alternative Protein Names'. But we usually have a primary protein name, and we should handle protein names like gene names: one primary, the rest alternative.

Thus on the Gene Page AND the protein tab there would be a 'Protein Name' line, plus an extra line for 'Alternative Protein Names':

Protein Name: ABP34
Alternative Protein Names: p30, p34

The example above is gene abpB / DDB_G0279081 (http://dictybase.org/gene/abpB), which currently has 3 'alternative protein names'. Which one should we choose as primary? Maybe you can implement certain rules:

  1. Choose the upper-case name (first character is upper case, or all characters are)
  2. Choose the shortest
  3. Choose the first

If there is no upper-case name, then the shortest should help to choose the primary protein name. If that doesn't give a unique result, simply choose the first. Maybe after migration we can get a list of all genes with multiple protein names that have been automatically separated, so that we curators can check them.
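A minimal sketch of how these rules could be applied, assuming the names for a gene arrive as an ordered list of strings. The function name `split_protein_names` and the `needs_review` flag are hypothetical, and this is only one reading of the rules: prefer names whose first character is upper case, then take the shortest, then the first on a tie.

```python
def split_protein_names(names):
    """Split a list of protein names into (primary, alternatives, needs_review).

    Hypothetical helper, not part of any existing dictyBase code.
    Rules, in order:
      1. prefer names whose first character is upper case
      2. among those (or among all names, if none qualify) take the shortest
      3. on a tie, take the one that comes first in the input list
    """
    if not names:
        return None, [], False

    # Rule 1: names starting with an upper-case letter (covers all-caps too).
    upper = [n for n in names if n[:1].isupper()]
    pool = upper or names

    # Rules 2 and 3: min() returns the first of several equally short names.
    primary = min(pool, key=len)

    # Remove only the chosen occurrence, keeping the original order.
    idx = names.index(primary)
    alternatives = names[:idx] + names[idx + 1:]

    # Anything with more than one name was split automatically,
    # so flag it for the curator check list.
    needs_review = len(names) > 1
    return primary, alternatives, needs_review


print(split_protein_names(["ABP34", "p30", "p34"]))
# ('ABP34', ['p30', 'p34'], True)
```

The `needs_review` flag is there so that the list of automatically separated entries can be collected after migration for curators to check.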

@rjdodson

pfey03 commented 6 years ago

This also needs to be done by curation. For historic protein names it's not easy to say automatically which one is the primary. So maybe we just take the first, then download a table that we can correct and upload again; that would be great. In the future we would add names as we do for gene names: a primary and alternative names.
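A rough sketch of that download/correct/upload round trip, using Python's standard `csv` module. The column names (`gene_id`, `primary_name`, `alternative_names`), the `|` separator, and both function names are assumptions for illustration, not an existing dictyBase format; a name that itself contains `|` would need a different separator.

```python
import csv
import io

# Hypothetical column layout for the curator review table.
FIELDS = ["gene_id", "primary_name", "alternative_names"]


def export_review_table(names_by_gene, out):
    """Write one row per gene that has more than one protein name.

    The first name is taken as the provisional primary; the rest are
    joined with '|' in the alternative_names column.
    """
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for gene_id, names in names_by_gene.items():
        if len(names) < 2:
            continue  # nothing to separate, so nothing to review
        writer.writerow({
            "gene_id": gene_id,
            "primary_name": names[0],
            "alternative_names": "|".join(names[1:]),
        })


def load_corrected_table(src):
    """Read the (possibly curator-corrected) table back into a dict."""
    corrected = {}
    for row in csv.DictReader(src):
        alternatives = [a for a in row["alternative_names"].split("|") if a]
        corrected[row["gene_id"]] = (row["primary_name"], alternatives)
    return corrected


# Round trip through an in-memory buffer instead of a real file.
buf = io.StringIO()
export_review_table({"DDB_G0279081": ["p30", "ABP34", "p34"]}, buf)
buf.seek(0)
print(load_corrected_table(buf))
```

In practice a curator would edit the exported file between the two steps, for example swapping `p30` and `ABP34`, before it is loaded back.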