pombase / canto

The PomBase community curation tool
https://curation.pombase.org

Add start of --change-gene-id admin action #2822

Closed kimrutherford closed 7 months ago

kimrutherford commented 7 months ago

Refs pombase/canto#2677

jseager7 commented 7 months ago

@kimrutherford I've added code to update the gene name in the allele name, and to update the gene name (rangeDisplayName) and primary identifier (rangeValue) in annotation extensions.

I've tested this with UniProtKB accession numbers in PHI-Canto and it seems to work fine, but you'll have to test it with PomBase / Chado as the gene source because I don't know whether my changes are safe to use with those.

Specifically, I'm assuming there will only ever be one result returned from UniProtKB in the $from_id_lookup_result and $to_id_lookup_result, which allows me to simplify the code for getting the old and new gene names to this:

  my $old_name = $from_id_lookup_result->{found}->[0]->{primary_name};
  my $new_name = $to_id_lookup_result->{found}->[0]->{primary_name};

But I don't know whether that assumption holds for PomBase, or if it even holds for UniProtKB. I guess a safer solution would be to iterate through the results and find the first result where the primary identifier matches the $from_id (then do the same for $to_id), then get the new gene name from that result. I couldn't figure out how to do this at the time though.
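
A minimal sketch of the safer lookup described above, in case it is useful as a starting point. It assumes each entry in `found` stores its identifier under a `primary_identifier` key (only `primary_name` is confirmed by the snippet above), and `name_for_id` is a hypothetical helper, not an existing Canto function. The lookup data is made up for illustration.

```perl
use strict;
use warnings;
use List::Util qw(first);

# Hypothetical helper (not part of Canto): return the primary_name of the
# first entry in $lookup_result->{found} whose primary identifier equals $id,
# or undef if no entry matches.
# Assumption: entries store their identifier under 'primary_identifier'.
sub name_for_id {
  my ($lookup_result, $id) = @_;

  my $match = first {
    defined $_->{primary_identifier} && $_->{primary_identifier} eq $id
  } @{ $lookup_result->{found} // [] };

  return defined $match ? $match->{primary_name} : undef;
}

# Made-up lookup result, for illustration only
my $from_id_lookup_result = {
  found => [
    { primary_identifier => 'P00001', primary_name => 'geneA' },
    { primary_identifier => 'P00002', primary_name => 'geneB' },
  ],
};

my $old_name = name_for_id($from_id_lookup_result, 'P00002');
```

Returning undef when nothing matches lets the caller decide whether to abort the change or fall back to the first result, rather than silently using the wrong gene name.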

kimrutherford commented 7 months ago

Hi James.

Thanks very much for those changes. It all looks good to me. I'm going to merge the PR. We can add any fixes to the main branch.

> But I don't know whether that assumption holds for PomBase, or if it even holds for UniProtKB.

It's not going to be a problem for PomBase because our lookup code will only return a single gene for a systematic ID.

I think it's OK for UniProt too since we're looking up accessions.

The web service used for the UniProt lookup is configured in canto.yaml:

webservices:
  uniprot_batch_lookup_url: 'https://rest.uniprot.org/uniprotkb/search?format=xml&query='

jseager7 commented 7 months ago

> It all looks good to me. I'm going to merge the PR.

Great, thanks again for your help.