pombase / pombase-chado

PomBase code for accessing Chado
MIT License

prp10-1 allele issue #1142

Closed ValWood closed 9 months ago

ValWood commented 9 months ago

I am sure that I fixed this a while back (Aliana Willet figured it out, IIRC).

However, it is still in @manulera's log here: https://github.com/pombase/allele_qc/tree/master/results https://github.com/pombase/allele_qc/blob/master/results/allele_cannot_fix_sequence_errors.tsv

It is correct in Canto curs/24a3853610543a99/genotype_manage#/select/5

Screenshot 2024-02-21 at 17 25 27

but wrong on the website https://www.pombase.org/genotype/prp10-1-A1089V,S1097F-amino_acid_mutation-expression-not_assayed_or_wild_type

Screenshot 2024-02-21 at 17 28 57

The only explanation I could think of is that it is also incorrectly named in another session, but then it would appear in

https://curation.pombase.org/dumps/latest_build/logs/log.2024-02-20-22-18-23.chado_checks.duplicate_allele_descriptions

Any ideas?

ValWood commented 9 months ago

Yep, it was fixed on 11 Jan?

kimrutherford commented 9 months ago

> It is correct in Canto curs/24a3853610543a99/genotype_manage#/select/5
>
> but wrong on the website

Canto and the website look the same to me?

prp10-1 is in a few sessions. The descriptions vary:

https://curation.pombase.org/pombe/curs/e8547aef6b97c8ef/genotype_manage/ro
https://curation.pombase.org/pombe/curs/24a3853610543a99/genotype_manage/ro
https://curation.pombase.org/pombe/curs/1088ee3f2681b5d2/genotype_manage/ro
https://curation.pombase.org/pombe/curs/79cfb1652794d9bc/genotype_manage/ro
https://curation.pombase.org/pombe/curs/7d409d497eb075ca/genotype_manage/ro

> The only explanation I could think of is that it is also incorrectly named in another session, but then it would appear in
>
> https://curation.pombase.org/dumps/latest_build/logs/log.2024-02-20-22-18-23.chado_checks.duplicate_allele_descriptions

I don't understand that bit.

ValWood commented 9 months ago

> I don't understand that bit.

So if the descriptions vary, shouldn't this be reported in a log (not duplicate descriptions, which is same description, different names)? https://curation.pombase.org/dumps/latest_build/logs/log.2024-02-20-22-18-23.chado_checks.duplicated_allele_names

ValWood commented 9 months ago

@manulera After this, the only ones to fix are fin1-KD, loh4-1, wtf18-2D366NN. I am waiting for the authors for updates on these. I might change them to "other" with a note that the reported residues are incorrect.

kimrutherford commented 9 months ago

> So if the descriptions vary, shouldn't this be reported in a log (not duplicate descriptions, same description, different names)

Ah, right. Sorry I misread what you wrote.

Duplicates loaded from Canto are caught earlier: https://curation.pombase.org/dumps/builds/pombase-build-2024-02-21/logs/log.2024-02-20-22-18-23.curation-tool-data-load-output

The chado_checks.duplicated_allele_names log file reports those that sneak into Chado in other ways, like from PHAF files.

ValWood commented 9 months ago

I always forget that. I was looking only in the files with "allele" in the name. I will do this log tomorrow!

ValWood commented 9 months ago

I'd like to chat about this. I keep doing the fixes, but then the next day more errors appear. For example, I have been fixing tor2-ts10 for weeks (I probably cleared all of the visible errors for this one at least 10 times). I fix everything in the logs and the next day there are more, so somehow not every error is reported.

I don't think we need to change the behaviour, because once everything is fixed there should only be odd cases. But could you do a Chado query and locate everything with differences for tor2-ts10 and prp10-1?

kimrutherford commented 9 months ago

| session | name | description | type | expression |
| --- | --- | --- | --- | --- |
| 12c854f8bde1a0cd | tor2-ts10 | A1398E,F2198L | amino acid substitution(s) | Not assayed |
| 4847e0de3cb01075 | tor2-ts10 | A1399E,F2198L | amino acid substitution(s) | Not assayed |
| 4847e0de3cb01075 | tor2-ts10 | A1398E,F2198L | amino acid substitution(s) | Not assayed |
| 5bef3e6a63b5bcf4 | tor2-ts10 | A1398E,F2198L | amino acid substitution(s) | Not assayed |
| 8589306346786b20 | tor2-ts10 | A1398E,F2198L | amino acid substitution(s) | Not assayed |
| b414bd2e9981203d | tor2-ts10 | A1398E,F2198L | amino acid substitution(s) | Not assayed |
| f07cd572b2d0adcf | tor2-ts10 | A1398E,F2198L | amino acid substitution(s) | Not assayed |

| session | name | description | type | expression |
| --- | --- | --- | --- | --- |
| 1088ee3f2681b5d2 | prp10-1 | A1050V,S1058F | amino acid substitution(s) | Not assayed |
| 24a3853610543a99 | prp10-1 | A1050V,S1058F | amino acid substitution(s) | Not assayed |
| 79cfb1652794d9bc | prp10-1 | A1089V,S1097F | amino acid substitution(s) | Not assayed |
| 79cfb1652794d9bc | prp10-1 | A1050V,S1058F | amino acid substitution(s) | Not assayed |
| 7d409d497eb075ca | prp10-1 | A1050V,S1058F | amino acid substitution(s) | Not assayed |
| e8547aef6b97c8ef | prp10-1 | A1050V,S1058F | amino acid substitution(s) | Not assayed |

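
For reference, a minimal sketch of how rows like these could be checked for alleles whose description differs between sessions. This is not the query that produced the tables; `ROWS` and `conflicting_descriptions` are hypothetical names used only for illustration, and the rows are copied from the tables above.

```python
from collections import defaultdict

# (session, allele name, description) rows copied from the tables above.
ROWS = [
    ("12c854f8bde1a0cd", "tor2-ts10", "A1398E,F2198L"),
    ("4847e0de3cb01075", "tor2-ts10", "A1399E,F2198L"),
    ("4847e0de3cb01075", "tor2-ts10", "A1398E,F2198L"),
    ("5bef3e6a63b5bcf4", "tor2-ts10", "A1398E,F2198L"),
    ("8589306346786b20", "tor2-ts10", "A1398E,F2198L"),
    ("b414bd2e9981203d", "tor2-ts10", "A1398E,F2198L"),
    ("f07cd572b2d0adcf", "tor2-ts10", "A1398E,F2198L"),
    ("1088ee3f2681b5d2", "prp10-1", "A1050V,S1058F"),
    ("24a3853610543a99", "prp10-1", "A1050V,S1058F"),
    ("79cfb1652794d9bc", "prp10-1", "A1089V,S1097F"),
    ("79cfb1652794d9bc", "prp10-1", "A1050V,S1058F"),
    ("7d409d497eb075ca", "prp10-1", "A1050V,S1058F"),
    ("e8547aef6b97c8ef", "prp10-1", "A1050V,S1058F"),
]


def conflicting_descriptions(rows):
    """Map each allele name with more than one distinct description
    to {description: [sessions using that description]}."""
    by_name = defaultdict(lambda: defaultdict(list))
    for session, name, description in rows:
        by_name[name][description].append(session)
    # Keep only names that appear with at least two different descriptions.
    return {
        name: dict(descriptions)
        for name, descriptions in by_name.items()
        if len(descriptions) > 1
    }


if __name__ == "__main__":
    for name, descriptions in conflicting_descriptions(ROWS).items():
        print(name)
        for description, sessions in descriptions.items():
            print(f"  {description}: {', '.join(sessions)}")
```

Grouping by name first and then by description makes the odd one out easy to spot: the description used in only one session is most likely the one that still needs fixing.
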
ValWood commented 9 months ago

hopefully this time

kimrutherford commented 9 months ago

I think this is fixed now.

ValWood commented 9 months ago

happy days!