pombase / allele_qc

Quality control for PomBase alleles
MIT License

Unexpected allele changes #49

Closed by manulera 1 year ago

manulera commented 1 year ago

Hi @kimrutherford.

@ValWood has flagged some alleles that have been changed recently, and I am not sure why they have undergone certain changes.

This was a duplication: both alleles had the description D37A. The coordinates are incorrect because they refer to positions in a different organism; the correct description is D90A. I seem to have fixed this manually for isu1-D37A here:

https://github.com/pombase/allele_qc/blob/8cc48d73fd06193037ba50151f59a668bde8f1d7/change_log/allele_manual_changes_formatted_19042023.tsv?plain=1#L33

However, isu1-D37 has also been fixed, even though it's not in the list of manual fixes. Perhaps I changed it in Canto by hand and forgot to rename the allele, but I was wondering whether the pipeline that ingests the changes into Canto could be responsible instead. Does it only check for pairs of systematic id + description? In other words, if two alleles of the same gene have different names (isu1-D37A and isu1-D37) but the same description (D37A), do both get changed in Canto, even if only one (isu1-D37A) is mentioned in the fixing file?
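
To make the question concrete, here is a minimal sketch of the two matching strategies. It is not the actual Canto ingestion code: the `Allele` class, the `apply_fix` helper, and the field names in the `fix` dictionary are all hypothetical and only illustrate how the choice of matching key would change the outcome.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Allele:
    # Hypothetical, simplified allele record (not the real Canto schema).
    systematic_id: str
    name: str
    description: str


def apply_fix(alleles, fix, key_fields):
    """Apply `fix` to every allele whose `key_fields` match the fix entry.

    Returns a new list; alleles that do not match are left unchanged.
    """
    fix_key = tuple(fix[field] for field in key_fields)
    updated = []
    for allele in alleles:
        allele_key = tuple(getattr(allele, field) for field in key_fields)
        if allele_key == fix_key:
            updated.append(replace(allele, description=fix["new_description"]))
        else:
            updated.append(allele)
    return updated


# Two alleles of the same gene with different names but the same description.
alleles = [
    Allele("SPAC227.13c", "isu1-D37A", "D37A"),
    Allele("SPAC227.13c", "isu1-D37", "D37A"),
]

# Hypothetical fix entry that only mentions isu1-D37A.
fix = {
    "systematic_id": "SPAC227.13c",
    "name": "isu1-D37A",
    "description": "D37A",
    "new_description": "D90A",
}

# Matching on systematic id + description changes both alleles.
by_description = apply_fix(alleles, fix, ("systematic_id", "description"))

# Matching on systematic id + name changes only the allele named in the fix.
by_name = apply_fix(alleles, fix, ("systematic_id", "name"))
```

If the ingestion matched on systematic id + description, as in `by_description`, isu1-D37 would be changed along with isu1-D37A even though it is not listed in the fixing file.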

ValWood commented 1 year ago

I think this is OK actually. This is another case of an error which was hidden before but came to light because of the expression merges.

SPAC227.13c isu1-D37 32806b589efc50fd D90A isu1-D37A 413f9c27bbcf0d46

I will fix this, but to avoid confusion, should I rename the allele to isu1-D90A and keep isu1-D37A as a synonym? Maybe I should also add an allele comment to this file: pombe-embl/supporting_files/allele_comments.txt

In Canto, fixes to alleles need to be made independently (this is why we would like an "App" for global fixes).