geneontology / go-site

A collection of metadata, tools, and files associated with the Gene Ontology public web presence.
http://geneontology.org
BSD 3-Clause "New" or "Revised" License

users.yaml and dbxrefs.yaml should be looked at for proper entries #856

Open dougli1sqrd opened 5 years ago

dougli1sqrd commented 5 years ago

We have been adding entries to groups.yaml and dbxrefs.yaml in order to accommodate assigned_by and DB column values in GAF annotations that are not currently in the groups and dbxrefs yamls.

We have recently updated the GAF parser so that it obeys the part of gorule-0000027 which says assigned_by should be in groups and the DB column should be in dbxrefs. As a result, the parser now filters or warns on many annotations. We're trying to add to these files so fewer annotations are filtered. We should ensure that these additions are correct and up to date.
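As a rough illustration of the check described above (not the actual parser code), the sketch below tests one GAF line against the two lists. It assumes the GAF 2.x tab-separated layout, with DB in column 1 and assigned_by in column 15. The function name `check_line` and the sample sets are hypothetical stand-ins for values that would be loaded from dbxrefs.yaml and groups.yaml.

```python
def check_line(gaf_line, known_dbs, known_groups):
    """Return a list of problems found in one GAF annotation line.

    Assumes GAF 2.x layout: column 1 is DB, column 15 is assigned_by.
    """
    cols = gaf_line.rstrip("\n").split("\t")
    if len(cols) < 15:
        return ["too few columns: %d" % len(cols)]
    problems = []
    db, assigned_by = cols[0], cols[14]
    if db not in known_dbs:
        problems.append("DB %r not in dbxrefs" % db)
    if assigned_by not in known_groups:
        problems.append("assigned_by %r not in groups" % assigned_by)
    return problems


# Hypothetical sample data; real values would come from the yaml files.
known_dbs = {"UniProtKB", "FB"}
known_groups = {"FlyBase", "GO_Central"}

cols = ["FB", "FBgn0000001", "sym", "", "GO:0005575", "PMID:1", "ND", "",
        "C", "", "", "gene", "taxon:7227", "20190101", "FLYBASE"]
line = "\t".join(cols)
print(check_line(line, known_dbs, known_groups))
```

In this sample the DB value passes but the all-caps assigned_by value does not, which is the kind of mismatch that leads to an annotation being filtered or flagged.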

There are a few assigned_by groups we are not correcting, because they are reasonable duplicates of existing entries and should be corrected by the groups themselves:

- GOC (should be GO_Central)
- AGBASE (should be AgBase)
- FLYBASE (should be FlyBase)
- GO_CENTRAL (should be GO_Central)
- UniProtKB (should be UniProt)
- ParkinsonsUK-UCL (should be UCL-Parkinsons)
- GDB (should be MaizeGDB)
- NTNU_SB (should be NTNU)
- WormBase (should be WB)
- GO_Noctua
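Several of the names listed above (AGBASE, FLYBASE, GO_CENTRAL) differ from an existing shorthand only by case. A minimal sketch of how such variants could be flagged automatically is below. The function name `case_variants` and the sample data are hypothetical, and names such as GOC or GDB would still need a manual decision because they are not simple case variants.

```python
def case_variants(assigned_by_values, shorthands):
    """Map each unknown value to the known shorthand it matches ignoring case."""
    by_lower = {s.lower(): s for s in shorthands}
    suggestions = {}
    for value in assigned_by_values:
        if value in shorthands:
            continue  # already an exact match, nothing to suggest
        match = by_lower.get(value.lower())
        if match is not None:
            suggestions[value] = match
    return suggestions


# Hypothetical sample data taken from the names discussed in this issue.
shorthands = {"AgBase", "FlyBase", "GO_Central"}
seen = ["AGBASE", "FLYBASE", "GO_CENTRAL", "GOC", "FlyBase"]
print(case_variants(seen, shorthands))
```

The result only suggests a correction; whether the source file or groups.yaml should change is still up to the contributing group.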

kltm commented 5 years ago

tagging @vanaukenk

pgaudet commented 5 years ago

@dougli1sqrd Can you tell which files these incorrect groups come from?

cmungall commented 5 years ago

Note that, for historic reasons, GOC and GO_Central are considered distinct. gorule-0000023 dictates that GOC is used as provenance for these inferences, and it is used in filtering by downstream consumers like @tonysawfordebi

This is obviously bad overloading, forced on us by GAF

For now I suggest having a distinct entry in groups.yaml for GOC and marking it as a legacy entry for this purpose.

cmungall commented 5 years ago

looks like the allcaps ones are coming from E-coli (checked in AmiGO, facet on contributor field)

tonysawfordebi commented 5 years ago

I believe that ParkinsonsUK-UCL and NTNU_SB are the correct attributions.

Can you confirm / deny @RLovering, @mlacencio?

mlacencio commented 5 years ago

@tonysawfordebi , I confirm that our attribution is NTNU_SB.

RLovering commented 5 years ago

Hi, I am not sure which file needs correcting. Our annotations are listed as ParkinsonsUK-UCL, so this is my preferred option for the name.

Are you saying that in the file you are looking at our group is listed as both ParkinsonsUK-UCL and UCL-Parkinsons?

Thanks

Ruth

cmungall commented 5 years ago

groups.yaml has UCL-Parkinsons https://github.com/geneontology/go-site/blob/master/metadata/groups.yaml#L162-L164

so this should be changed to the one that is currently in use

mlacencio commented 5 years ago

Dear @dougli1sqrd ,

I have checked groups.yaml and found that the shorthand for NTNU is "NTNU". Could you please change it to "NTNU_SB"? This is our preferred option.

Thanks!

Marcio

dougli1sqrd commented 5 years ago

@pgaudet many of these come from several files.

@cmungall What should the URI for GOC be?

@mlacencio Sure thing, I'll make that change for NTNU.

I'll also make sure that UCL Parkinson's is ParkinsonsUK-UCL.

RLovering commented 5 years ago

Hi, our website has changed a few pages, but I am worried that I would mess something up. So please could you make the following edits for me:

Change:

label: "UCL-Parkinson's UK Annotation Project"
id: http://www.ucl.ac.uk/functional-gene-annotation/neurological/projects/parkinsons
shorthand: ParkinsonsUK-UCL

to:

label: "ParkinsonsUK-UCL"
id: https://www.ucl.ac.uk/functional-gene-annotation/neurological/completed-projects/completed-projects/parkinsons
shorthand: ParkinsonsUK-UCL

Change:

id: http://www.ucl.ac.uk/functional-gene-annotation/neurological/projects/tabs/aruk-ucl

to

id: https://www.ucl.ac.uk/functional-gene-annotation/neurological

and Change:

id: https://www.ucl.ac.uk/functional-gene-annotation/neurological/projects/tabs/syngo-ucl

to

id: https://www.ucl.ac.uk/functional-gene-annotation/neurological/syngo

Thanks

Ruth

dougli1sqrd commented 5 years ago

Hi Ruth, just so we're on the same page: you want the ParkinsonsUK-UCL entry to have that as both the label and the shorthand? The label is for the full name of a group, not necessarily what you would use for IDs in an annotation; that is what the shorthand is for.
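To make the label/shorthand distinction concrete, here is a minimal sketch using a hypothetical `Group` record with the three fields seen in the entry above (label, id, shorthand). It assumes an annotation's assigned_by value is compared against the shorthand only; that is an illustration of the point, not a description of how the parser is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    label: str      # full, human-readable name of the group
    id: str         # URI identifying the group
    shorthand: str  # short name expected in the assigned_by column


def find_group(assigned_by, groups):
    """Return the group whose shorthand equals assigned_by, or None."""
    for group in groups:
        if group.shorthand == assigned_by:
            return group
    return None


# Entry values copied from the current record quoted earlier in this thread.
groups = [
    Group(
        label="UCL-Parkinson's UK Annotation Project",
        id="http://www.ucl.ac.uk/functional-gene-annotation/neurological/projects/parkinsons",
        shorthand="ParkinsonsUK-UCL",
    ),
]

print(find_group("ParkinsonsUK-UCL", groups) is not None)
print(find_group("UCL-Parkinson's UK Annotation Project", groups) is not None)
```

Under that assumption, changing the label does not affect which annotations match, while changing the shorthand does.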