hputnam / Geoduck_Meth

DNA methylation of geoduck exposed to ocean acidification treatments
Confirm origins of Panopea-generosa-genes-annotations.tab #16

Closed: shellywanamaker closed this issue 3 years ago

shellywanamaker commented 4 years ago

@sr320 can you confirm the origins of this file that is on OSF here https://osf.io/4etsu/ ?

Related to https://github.com/RobertsLab/resources/issues/764 and https://github.com/hputnam/Geoduck_Meth/issues/13

shellywanamaker commented 3 years ago

I think this may be here somewhere https://github.com/sr320/nb-2020/tree/master/P_generosa

shellywanamaker commented 3 years ago

@kubu4 I think Panopea-generosa-genes-annotations.tab came from Panopea-generosa-v1.0.a4.gene.GOids.tab, which I think came from your Nov 26 2019 GenSAS analysis?

kubu4 commented 3 years ago

> Nov 26 2019 GenSAS analysis?

I don't have a notebook entry on this date for any GenSAS stuff. Can you elaborate on this when you have a sec?

kubu4 commented 3 years ago

> Can you elaborate on this when you have a sec?

Never mind, I know what's going on here. That date is when that job in the GenSAS pipeline was created. I think the full GenSAS analysis completed in Dec. 2020.

kubu4 commented 3 years ago

> I think Panopea-generosa-genes-annotations.tab came from Panopea-generosa-v1.0.a4.gene.GOids.tab, which I think came from your Nov 26 2019 GenSAS analysis?

That file was not generated by GenSAS. The header indicates it is a GFF, but the file is not a GFF, and GenSAS never produced a file like that.
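
A claim like this can be checked quickly by testing whether the data lines have the nine tab-separated columns that GFF3 requires. A minimal sketch in Python, assuming the lines are already read into a list; the function name `looks_like_gff3` and the sample lines are hypothetical and not taken from the actual file:

```python
def looks_like_gff3(lines):
    """Basic GFF3 sanity check: every non-comment, non-blank line must have
    9 tab-separated fields with integer start/end coordinates."""
    saw_feature = False
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and directives such as ##gff-version 3
        fields = line.split("\t")
        if len(fields) != 9:
            return False
        start, end = fields[3], fields[4]
        if not (start.isdigit() and end.isdigit()):
            return False
        saw_feature = True
    # A file with only a header and no feature lines is not treated as GFF3
    return saw_feature


# Hypothetical examples: a feature line versus an annotation-table row
# that sits under a GFF-style header.
gff_lines = [
    "##gff-version 3",
    "scaffold_01\texample\tgene\t100\t900\t.\t+\t.\tID=gene0001",
]
table_lines = [
    "##gff-version 3",
    "gene0001\tP12345\tGO:0005524; GO:0016301",
]

print(looks_like_gff3(gff_lines))    # True
print(looks_like_gff3(table_lines))  # False
```

This only checks column count and coordinates, so it is a rough filter rather than a full GFF3 validator.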

shellywanamaker commented 3 years ago

It must be a file resulting from a join of the GFF you made with GO annotations downloaded from UniProt, like @sr320 did here https://github.com/sr320/nb-2019/blob/master/P_generosa/32-Gene-GO-Annotations.ipynb
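
The kind of join described here could look roughly like the sketch below. This is not the code from the linked notebook; the column names (`gene_id`, `uniprot_acc`, `go_ids`), the function `join_gene_go`, and the sample rows are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical inputs: gene IDs with their best UniProt hit, and a
# UniProt accession to GO ID table. Real files would be read from disk.
gene_hits_tsv = (
    "gene_id\tuniprot_acc\n"
    "gene0001\tP12345\n"
    "gene0002\tQ67890\n"
    "gene0003\tA0A000\n"
)
uniprot_go_tsv = (
    "uniprot_acc\tgo_ids\n"
    "P12345\tGO:0005524; GO:0016301\n"
    "Q67890\tGO:0003677\n"
)


def join_gene_go(gene_hits, uniprot_go):
    """Left-join gene hits to GO IDs on the UniProt accession.

    Genes whose accession has no GO entry keep an empty GO field."""
    go_by_acc = {
        row["uniprot_acc"]: row["go_ids"]
        for row in csv.DictReader(io.StringIO(uniprot_go), delimiter="\t")
    }
    joined = []
    for row in csv.DictReader(io.StringIO(gene_hits), delimiter="\t"):
        go_ids = go_by_acc.get(row["uniprot_acc"], "")
        joined.append((row["gene_id"], row["uniprot_acc"], go_ids))
    return joined


for gene_id, acc, go_ids in join_gene_go(gene_hits_tsv, uniprot_go_tsv):
    print(f"{gene_id}\t{acc}\t{go_ids}")
```

A left join keeps genes without GO terms in the output, which makes it easier to see how many genes went unannotated.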

kubu4 commented 3 years ago

I'm fairly certain it was generated by @sr320.

kubu4 commented 3 years ago

The explanation is that the file was generated using GenSAS BLAST table results from intermediate gene models. Final gene annotations are magically determined by GenSAS evaluating BLAST results from multiple intermediate gene models.