Open kimrutherford opened 2 years ago
I've had a think and this might not be too tricky to implement once this issue is done:
For that issue we'll need to process the allele data into a more usable form for the website, to make the allele pages. Once that's done, it will be straightforward to write the current allele IDs, names etc. to a file. Then at the start of the next load we would read that allele file into Chado.
The idea would be that once an allele had an ID, it would always be passed along to next night's load. We would be initialising Chado with the previous alleles before we start loading from Canto and PHAF files.
There will still be a few changes to make in the loading code but mostly we'd just need to add a new loader for the existing allele data.
Note to self: the alleles will need to be loaded (just) after the contig files because the alleles reference genes.
OK great. Don't hurry, I can say in progress if we mention it at all.
> Once that's done, it will be straightforward to write the current allele IDs, names etc. to a file.
It turns out that writing the allele details to a file was straightforward. The allele information was already collected in the correct format in memory during the Chado-to-website processing step. It took just a few lines of extra code to produce a useful file: https://curation.pombase.org/dumps/latest_build/misc/allele_summaries.json
> I've had a think and this might not be too tricky to implement once this issue is done:
> - pombase/website#1294
I've thought some more. It doesn't make sense to have allele pages until we have permanent identifiers so these two issues need to be completed in parallel.
> The idea would be that once an allele had an ID, it would always be passed along to next night's load. We would be initialising Chado with the previous alleles before we start loading from Canto and PHAF files.
I think this is mostly solved. I've written a new loading step that reads the allele IDs and details from the JSON file written by the previous load. Since the IDs and details will be pre-loaded, the allele IDs should stay the same from load to load.
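A minimal sketch of the pre-loading idea, in Python rather than the actual loader code. The record layout (`gene`, `name`, `description`, `allele_type`, `uniquename`), the key used to recognise an allele, the `GENE1` identifiers and all function names are assumptions for illustration; the real JSON file and loading step may well differ.

```python
import json
import re

def allele_key(record):
    """Key used to recognise the same allele across loads (an assumption)."""
    return (record["gene"], record["name"],
            record["description"], record["allele_type"])

def load_previous_alleles(json_text):
    """Map allele key -> ID, read from the previous load's JSON dump."""
    return {allele_key(r): r["uniquename"] for r in json.loads(json_text)}

def next_allele_number(known_ids, gene):
    """Next unused N for IDs shaped like '<gene>:allele-N'."""
    pattern = re.compile(re.escape(gene) + r":allele-(\d+)$")
    used = [0]
    for allele_id in known_ids:
        match = pattern.match(allele_id)
        if match:
            used.append(int(match.group(1)))
    return max(used) + 1

def assign_ids(previous, incoming):
    """Reuse pre-loaded IDs; mint a new ID only for alleles not seen before."""
    assigned = dict(previous)
    for record in incoming:
        key = allele_key(record)
        if key not in assigned:
            number = next_allele_number(assigned.values(), record["gene"])
            assigned[key] = f"{record['gene']}:allele-{number}"
    return assigned

# Hypothetical data: one allele known from last night, one new tonight.
previous_json = json.dumps([
    {"gene": "GENE1", "name": "abc1-1", "description": "unknown",
     "allele_type": "unknown", "uniquename": "GENE1:allele-5"},
])
incoming = [
    {"gene": "GENE1", "name": "abc1-1", "description": "unknown",
     "allele_type": "unknown"},
    {"gene": "GENE1", "name": "abc1-2", "description": "unknown",
     "allele_type": "unknown"},
]
ids = assign_ids(load_previous_alleles(previous_json), incoming)
```

With a key like this one, a renamed allele would look like a brand new allele and get a fresh ID, so the choice of key matters.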
I'm going to wait until Friday night before I activate the new step in case something goes wrong. My plan then is to check the load and compare with Thursday night's to make sure it worked OK.
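The comparison between two nights could be sketched roughly like this. It assumes each load has been reduced to a dictionary from a (gene, allele name) pair to the allele ID; that representation, the function name and the example IDs are all hypothetical.

```python
def compare_allele_ids(previous, current):
    """Report alleles whose ID changed, disappeared or appeared between loads."""
    shared = previous.keys() & current.keys()
    changed = {key: (previous[key], current[key])
               for key in shared if previous[key] != current[key]}
    missing = sorted(previous.keys() - current.keys())
    added = sorted(current.keys() - previous.keys())
    return changed, missing, added

# Hypothetical data for two consecutive loads.
thursday = {
    ("GENE1", "abc1-1"): "GENE1:allele-1",
    ("GENE1", "abc1-2"): "GENE1:allele-2",
}
friday = {
    ("GENE1", "abc1-1"): "GENE1:allele-1",
    ("GENE1", "abc1-2"): "GENE1:allele-3",
    ("GENE2", "xyz1-1"): "GENE2:allele-1",
}
changed, missing, added = compare_allele_ids(thursday, friday)
```

If the new step works as intended, `changed` should be empty for any allele that existed in both loads.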
> I'm going to wait until Friday night before I activate the new step in case something goes wrong.
This is on hold until I fix: pombase/pombase-chado#992
Note to self: don't forget to change the JaponicusDB build script to match any pombe changes.
(From: https://github.com/pombase/website/issues/1294#issuecomment-1218509827)
We need to think about what happens when an allele is renamed, or its description changes. We talked on Zoom about implementing a new tool (maybe within Canto) that will allow alleles to be renamed in all sessions at once. I've made an issue:
We also need to think about what happens when an allele is deleted, but there is a plan for that: Chado has an "is_obsolete" field for each feature. If we set that to true for deleted alleles, I think everything will work out.
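A rough sketch of the obsolete-flag plan, using plain Python sets instead of Chado. Only the `is_obsolete` field name comes from Chado; the function name and the example IDs are hypothetical.

```python
def obsolete_flags(previous_ids, current_ids):
    """Return {allele_id: is_obsolete} for every ID that has ever been issued.

    An ID seen in the previous load but absent from the current one is kept
    and flagged as obsolete rather than dropped, so it is never reused.
    """
    return {allele_id: allele_id not in current_ids
            for allele_id in previous_ids | current_ids}

# Hypothetical IDs: allele-1 was deleted, allele-3 is new.
previous_ids = {"GENE1:allele-1", "GENE1:allele-2"}
current_ids = {"GENE1:allele-2", "GENE1:allele-3"}
flags = obsolete_flags(previous_ids, current_ids)
```

Keeping the deleted ID around with the flag set means the next load can still see that the number is taken.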
Now waiting on:
I've done some test runs today and it's all in a better state than I remember. :-)
It will probably take quite a few nightly loads to get things right so I suggest that we disable Canto and public pombase.org updates on Thursday night. That will give me Friday to commit and check the code and config changes. And then I'll have the weekend to run lots of test loads until things are working.
Does that sound OK?
I think this is the highest priority: https://curation.pombase.org/dumps/builds/pombase-build-2024-02-19/logs/log.2024-02-19-00-56-47.chado_checks.duplicated_allele_names
Could you look at this too when you get a chance?: https://github.com/pombase/curation/issues/3561
I'll double check the other log files before Friday to see if there is anything else that needs fixing urgently.
These should be fixed for tomorrow
> These should be fixed for tomorrow
Thanks!
Hi Val.
There is still a duplicate allele name:
Check for two or more alleles with the same name - CHECK FAILURE: expected 0 but got 2
name    uniquename             description  allele_type  canto_session
prp3-3  SPAC29E6.02:allele-5   unknown      unknown      e8547aef6b97c8ef
prp3-3  SPAC3A12.11c:allele-3  unknown      unknown      7d409d497eb075ca
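For reference, the logic of this check can be sketched in a few lines of Python. This is not the actual Chado check (which presumably runs as a query against the database); the function name and record layout are assumptions, and the two rows are the ones from the log above.

```python
from collections import defaultdict

def duplicated_allele_names(alleles):
    """Return {name: [uniquenames]} for names shared by two or more alleles."""
    by_name = defaultdict(list)
    for allele in alleles:
        by_name[allele["name"]].append(allele["uniquename"])
    return {name: ids for name, ids in by_name.items() if len(ids) > 1}

# The two rows reported in the log.
alleles = [
    {"name": "prp3-3", "uniquename": "SPAC29E6.02:allele-5"},
    {"name": "prp3-3", "uniquename": "SPAC3A12.11c:allele-3"},
]
duplicates = duplicated_allele_names(alleles)
```

Here the two alleles sharing a name belong to different genes, which suggests that one of them is attached to the wrong gene or needs a different name.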
Over the weekend I did a couple of full test loads on my desktop with the new Canto code and the new loading code. Everything seemed to work as planned.
Checklist for putting the new allele system into production:
etc/update_canto_alleles_from_chado.pl
etc/set_ids_from_chado.pl
Hi Val.
There are some Canto load errors that might be easier to fix before the allele systematic ID changes. Are these hard to fix?
On my to do list for this week.
I think prp3-1 is an allele of cwf2 but need to double check this
fixed prp3-1
I think I have finally cleared this log... will check tomorrow.
another go, hopefully tomorrow...
it worked, but I have new ones, will fix today!
I've now merged this code into the master branch (locally). It was a bit painful because there were conflicts with the code changes from pombase/canto#2544.
I've also merged the changes into the test Canto so we can test things: https://curation.pombase.org/test/curs/4666975359de04dd/genotype_manage
Note to self: branch issue-2758-disable-edits-for-existing-alleles-merged
> I've now merged this code into the master branch (locally).
Sorry, that wasn't clear. I've merged the code for preventing allele changes in Canto when you select an existing allele: pombase/canto#2758
We found that we could not edit any annotation for an existing allele in any annotation where the allele was used in a genetic interaction, even if that annotation was not used in the interaction.
e.g.
Let me know if you want to discuss.
CC @PCarme
For example, in the same session I want to change this evidence code to microscopy but I can't. I can't do it by copy-and-edit followed by delete either, because delete is also greyed out.
> We found that we could not edit any annotation for an existing allele in any annotation
I don't think this is due to the allele identifier changes. It's because of: pombase/canto#2740
I've pasted your comment into that issue.
See also: