Closed: vragh closed this issue 3 years ago
Dear @vragh
Thank you very much for your interest in UniprotR. The UniprotR::GetSequences function retrieves the canonical sequence of an entry based on its accession, because we call the UniProt API using the exact accession entered by the user. The current version doesn't support isoforms; we plan to handle both canonical sequences and isoforms in the next update.
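To illustrate what an accession-based lookup means here, below is a minimal R sketch. It is not UniprotR's internal code. The helper names `is_isoform_accession` and `fasta_url` are hypothetical, and the URL pattern is assumed to be the public UniProt REST endpoint (`rest.uniprot.org`). It relies on the UniProt convention that isoform accessions carry a numeric suffix such as `-2`, while a plain accession refers to the entry as a whole.

```r
# Hypothetical helpers for illustration only; not part of UniprotR.

# TRUE when the accession carries an isoform suffix such as "P04637-2".
is_isoform_accession <- function(acc) {
  grepl("-[0-9]+$", acc)
}

# Build the FASTA URL for an accession, assuming the public UniProt REST
# endpoint layout https://rest.uniprot.org/uniprotkb/<accession>.fasta
fasta_url <- function(acc) {
  paste0("https://rest.uniprot.org/uniprotkb/", acc, ".fasta")
}

accs <- c("P04637", "P04637-2")
print(is_isoform_accession(accs))
print(fasta_url(accs[1]))

# Fetching needs network access, so it is left commented out:
# fasta <- readLines(fasta_url("P04637"))
```

A plain accession like `P04637` carries no isoform suffix, so a lookup on it would be expected to return only the entry's default (canonical) sequence.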
Hi Mohmed,
Really appreciate the quick response!!
I see. But what about cases where no canonical isoform is designated (this is the case for a lot of the non-SwissProt entries)? I am not too familiar with what UniProt does here, but I presume GetSequences just takes whatever the API designates as the default sequence?
(I suppose I should clarify this with the UniProt folks.)
Hi @vragh
Yes, I assume UniprotR::GetSequences will return the first hit from the API response, but it would be good if you could share a list of accessions with me so we can try a real example.
Hi @MohmedSoudy
I was mistaken. It looks like unreviewed UniProt entries do not have multiple sequences assigned to them (at least as far as I can tell). So for those, GetSequences will always retrieve the representative (and only) sequence. I guess that solves my problem.
But that said, an update enabling the user to specify retrieval of non-canonical isoforms would be super beneficial!
Yes @vragh
As I told you, UniprotR::GetSequences will retrieve the sequence for the exact accession entered by the user, and UniProt assigns a unique accession to each entry. Hope our package makes your work easier.
@vragh Hi
We made an update: UniprotR now supports downloading both canonical and isoform sequences using the GetSequenceIso function.
Dear maintainers of UniprotR,
Firstly, thank you for writing and maintaining this awesome R package. It really is a lifesaver!!
I have a question about UniprotR::GetSequences. I see that in cases where a sequence has multiple isoforms on UniProt, GetSequences still only returns a single sequence (isoform). My question is, which isoform is this? Is it the canonical isoform (if a canonical isoform has been designated) or one chosen at random? How does GetSequences choose the isoform when no canonical isoform is designated?