Planteome / planteome-annotation-data

This is a place to discuss issues around the Planteome annotation data and store useful scripts etc.

Comma-separated synonym fields in GO GAF file #30

Closed: serenalotreck closed this issue 2 years ago

serenalotreck commented 2 years ago

I've been using the GAF files from Planteome to get the names and synonyms of database items, and I noticed that one of the GO files seems to have synonym fields that are separated by commas, rather than by pipe characters (|) like they're supposed to be.

The file I've noticed this in is go_gene_Oryza_Gramene.assoc. Is this intentional, or is it a mistake?

Thanks!
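For anyone who wants to check a file for the same problem, here is a minimal sketch. It relies on the GAF convention that column 11 (DB_Object_Synonym) holds pipe-separated synonyms and that comment lines start with `!`. The function name `find_comma_synonyms` is made up for illustration, and a comma is only a hint: a single synonym may legitimately contain one.

```python
SYNONYM_COL = 10  # zero-based index of GAF column 11 (DB_Object_Synonym)


def find_comma_synonyms(lines):
    """Yield (line_number, synonym_field) for rows whose synonym field contains a comma.

    Hypothetical helper for illustration; it only flags candidates and
    does not decide whether the comma is a separator or part of a name.
    """
    for lineno, line in enumerate(lines, start=1):
        if line.startswith("!"):
            continue  # GAF header/comment line
        cols = line.rstrip("\n").split("\t")
        if len(cols) <= SYNONYM_COL:
            continue  # too few columns to have a synonym field
        field = cols[SYNONYM_COL]
        if "," in field:
            yield lineno, field
```

It could be run as `with open("go_gene_Oryza_Gramene.assoc") as fh: hits = list(find_comma_synonyms(fh))`, after which the flagged rows can be reviewed by hand.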

elserj commented 2 years ago

@jaiswalp Can you please take a look at this file and let me know if these are multiple synonyms or one synonym with commas? I think they are multiple, but I can't find the source file on Gramene and I want to confirm before I make any changes.

Also, should the quotes be removed?

If you confirm they are multiple, I will make the changes.

jaiswalp commented 2 years ago

go ahead and make changes



cooperl09 commented 2 years ago

Why are some of the gene names in square brackets? Also, looks like some of the synonyms are duplicated: https://browser.planteome.org/amigo/gene_product/GR_gene:GR:0060146

elserj commented 2 years ago

I don't know why, but I'm removing those as well. I believe that once the file is corrected, the duplicates will resolve automatically as part of the loading process in AmiGO. I will still write a quick script to find and fix the dupes in the annotation file.
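A cleanup along these lines could be sketched as follows. This is not the script that was actually used. It assumes every comma in the synonym column separates two synonyms (as confirmed above for this file), that surrounding double quotes and a wrapping pair of square brackets should be dropped, and that duplicates are exact, case-sensitive matches. The helper names `clean_name`, `normalize_synonyms` and `fix_line` are hypothetical.

```python
SYNONYM_COL = 10  # zero-based index of GAF column 11 (DB_Object_Synonym)


def clean_name(raw):
    """Strip whitespace, surrounding double quotes, and one wrapping [ ] pair."""
    name = raw.strip().strip('"').strip()
    if name.startswith("[") and name.endswith("]"):
        name = name[1:-1].strip()
    return name


def normalize_synonyms(field):
    """Return the synonym field as pipe-separated, cleaned, de-duplicated names.

    Assumption: commas are separators, never part of a synonym.
    """
    seen = set()
    cleaned = []
    for part in field.replace(",", "|").split("|"):
        name = clean_name(part)
        if name and name not in seen:  # keep first occurrence, preserve order
            seen.add(name)
            cleaned.append(name)
    return "|".join(cleaned)


def fix_line(line):
    """Rewrite the synonym column of one GAF row; comment lines pass through."""
    if line.startswith("!"):
        return line
    cols = line.rstrip("\n").split("\t")
    if len(cols) > SYNONYM_COL:
        cols[SYNONYM_COL] = normalize_synonyms(cols[SYNONYM_COL])
    return "\t".join(cols) + "\n"
```

Applying `fix_line` to each line of the input and writing the results to a new file leaves the original untouched, which makes it easy to diff the two before committing the change.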

elserj commented 2 years ago

This should be fixed now. Please let me know if you find any I missed.