pombase / canto

The PomBase community curation tool
https://curation.pombase.org

Add start of --change-gene-id admin action #2822

Closed kimrutherford closed 1 month ago

kimrutherford commented 1 month ago

Refs pombase/canto#2677

jseager7 commented 1 month ago

@kimrutherford I've added code to update the gene name in the allele name, and to update the gene name (rangeDisplayName) and primary identifier (rangeValue) in annotation extensions.

I've tested this with UniProtKB accession numbers in PHI-Canto and it seems to work fine, but you'll have to test it with PomBase / Chado as the gene source because I don't know whether my changes are safe to use with those.

Specifically, I'm assuming there will only ever be one result returned from UniProtKB in the $from_id_lookup_result and $to_id_lookup_result, which allows me to simplify the code for getting the old and new gene names to this:

  my $old_name = $from_id_lookup_result->{found}->[0]->{primary_name};
  my $new_name = $to_id_lookup_result->{found}->[0]->{primary_name};

But I don't know whether that assumption holds for PomBase, or if it even holds for UniProtKB. I guess a safer solution would be to iterate through the results and find the first result where the primary identifier matches the $from_id (then do the same for $to_id), then get the new gene name from that result. I couldn't figure out how to do this at the time though.

kimrutherford commented 1 month ago

Hi James.

Thanks very much for those changes. It all looks good to me. I'm going to merge the PR. We can add any fixes to the main branch.

> But I don't know whether that assumption holds for PomBase, or if it even holds for UniProtKB.

It's not going to be a problem for PomBase because our lookup code will only return a single gene for a systematic ID.

I think it's OK for UniProt too since we're looking up accessions.
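
If that assumption ever needed guarding, a minimal sketch of the safer lookup described above might look like the following. It is only an illustration: `name_for_id` is a hypothetical helper, not something in Canto, and the `primary_identifier` key is assumed to sit alongside `primary_name` in each entry of `found`, so check that against the real lookup result before relying on it. The sample data is made up.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Canto): return the primary_name of the
# first entry in $lookup_result->{found} whose primary identifier equals $id.
# Returns undef in scalar context when nothing matches.
# ASSUMPTION: each entry has a "primary_identifier" key next to "primary_name".
sub name_for_id {
  my ($lookup_result, $id) = @_;

  # Treat a missing {found} list as empty rather than dying.
  for my $gene (@{ $lookup_result->{found} // [] }) {
    if (defined $gene->{primary_identifier}
        && $gene->{primary_identifier} eq $id) {
      return $gene->{primary_name};
    }
  }

  return;
}

# Made-up lookup result, for illustration only.
my $from_id = 'P12345';
my $from_id_lookup_result = {
  found => [
    { primary_identifier => 'Q99999', primary_name => 'other1' },
    { primary_identifier => 'P12345', primary_name => 'abc1' },
  ],
};

my $old_name = name_for_id($from_id_lookup_result, $from_id);
die "no lookup result matches $from_id\n" unless defined $old_name;
print "$old_name\n";
```

The same helper could be called with `$to_id_lookup_result` and `$to_id` to get the new name. Dying when nothing matches is just one option; the admin action might prefer to warn and skip instead.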

The web service used for the UniProt lookup is configured in canto.yaml:

webservices:
  uniprot_batch_lookup_url: 'https://rest.uniprot.org/uniprotkb/search?format=xml&query='

jseager7 commented 1 month ago

> It all looks good to me. I'm going to merge the PR.

Great, thanks again for your help.