jianhong / dagLogo

Visualize significant conserved amino acid sequence pattern in groups based on probability theory

Error while fetching sequence and preparing proteome #2

Open hassanakthv opened 2 years ago

hassanakthv commented 2 years ago

Hi,

I have been working with the dagLogo package for some time and everything worked smoothly, but suddenly I'm getting many errors from a script that was working two months ago.

First, there was an error when I used the prepareProteome function:

proteome <- prepareProteome("UniProt", species = "Rattus norvegicus")

Error:

Downloading data for species: Rattus norvegicus
trying URL 'http://www.uniprot.org/uniprot/?query=organism:10116&format=fasta'
Error in download.file(url = url, destfile = tempFile, ...) :
  cannot open URL 'http://www.uniprot.org/uniprot/?query=organism:10116&format=fasta'
In addition: Warning message:
In download.file(url = url, destfile = tempFile, ...) :
  cannot open URL 'https://rest.uniprot.org/uniprotkb/query=organism:10116&format=fasta': HTTP status was '400 Bad Request'


I also tried "homo sapien" and got the same error. Anyway, I resolved this by downloading the fasta file from UniProt.

Second, and this is where I'm still stuck, the fetchSequence function fails:

seq <- fetchSequence(toupper(as.character(df$symbol)), type = "uniprotswissprot", anchorAA = as.character(df$anchor), anchorPos = as.character(df$peptides), proteome = proteome, upstreamOffset=7, downstreamOffset=7)

Error:

Error in rep(seq.int(nrow(dat)), lengths(anchorPos)) : invalid 'times' argument
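
The message itself comes from base R's rep(), which rejects a 'times' vector whose length is neither 1 nor the length of the vector being repeated. The sketch below reproduces the same message with made-up data; the names dat, anchor_ok and anchor_short are illustrative only, and the thread does not confirm that this length mismatch is what happens inside fetchSequence.

```r
# Base R only; all values here are invented for illustration.
dat <- data.frame(id = c("P1", "P2", "P3"))

# Case 1: one 'times' entry per row, so rep() succeeds.
anchor_ok <- list(c(5L, 9L), 3L, 7L)
idx_ok <- rep(seq.int(nrow(dat)), lengths(anchor_ok))
print(idx_ok)

# Case 2: only two 'times' entries for three rows, so rep() fails.
anchor_short <- list(5L, 3L)
msg <- tryCatch(
  rep(seq.int(nrow(dat)), lengths(anchor_short)),
  error = function(e) conditionMessage(e)
)
print(msg)
```

If the same pattern applies here, comparing the number of rows matched in the proteome with the number of anchor positions supplied would be a reasonable first check.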


Here is the df table:

A tibble: 3 × 3

  symbol     peptides                                               anchor
1 A0A0G2K5E8 GPKGENGIVGPTGPVGAAGPSGPNGPPGPAGSRGDGGPPGMTGfPGAAGR     f
2 A0A0G2K5E8 GSDGSVGPVGPAGPIGSAGPPGfPGAPGPKGELGPVGNPGPAGPAGPR       f
3 A0A0G2K5E8 GAPGPDGNNGAQGPPGPQGVQGGKGEQGPAGPPGfQGLPGPSGTAGEVGKPGER f

Can you help me with this? Thanks!

jianhong commented 2 years ago

Thank you for reporting this. UniProt changed their REST API, so we need to update our URL format. I will keep you updated.
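
For readers who need a proteome before a fixed release is available, the sketch below builds a FASTA download URL in the newer REST style. The helper name build_uniprot_fasta_url is hypothetical, and the endpoint and the organism_id query field reflect my understanding of the current UniProt REST API rather than anything stated in this thread, so they may differ from what dagLogo uses internally.

```r
# Hypothetical helper, not part of dagLogo.
# Assumes the UniProtKB "stream" endpoint and the organism_id query field.
build_uniprot_fasta_url <- function(taxon_id) {
  sprintf(
    "https://rest.uniprot.org/uniprotkb/stream?query=organism_id:%d&format=fasta",
    as.integer(taxon_id)
  )
}

# 10116 is the taxonomy ID for Rattus norvegicus used in the failing URL above.
url <- build_uniprot_fasta_url(10116)
print(url)

# The download itself needs network access, so it is left commented out.
# download.file(url, destfile = "rat_proteome.fasta")
```

The downloaded file could then be passed to prepareProteome through its fasta argument, as done later in this thread.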

jianhong commented 2 years ago

Hi @hassanakthv ,

I updated the development version of dagLogo to fix this issue. You can try to install it via BiocManager::install("jianhong/dagLogo")

Let me know if it does not work for you.

Jianhong.

mengatron commented 1 year ago

Hello, I am trying to use dagLogo for a motif analysis, and I get an error when running the fetchSequence function.

The sequence length is 12 for all 59 peptides in the data file dat3. Here is my code:

seq <- fetchSequence(toupper(as.character(dat3$ProteinID)), type="uniprotswissprot", anchorAA="*", anchorPos=as.character(dat3$M_Sequence), proteome = proteome, upstreamOffset=7, downstreamOffset=7)

Error: Error in rep(seq.int(nrow(dat)), lengths(anchorPos)) : invalid 'times' argument

I am using the latest version of dagLogo, and I also tried your development version with BiocManager::install("jianhong/devtools"), but the error still occurs. Could you kindly have a look? If you need more information, please let me know. Thank you very much.

Best wishes,

Zhaowei

jianhong commented 1 year ago

@mengatron, could you please share a minimal code sample that reproduces your error? Jianhong.

mengatron commented 1 year ago

Hello Jianhong, thank you for your help. Here is a minimal sample:

  | X  | ProteinID | Sequence    | M_Sequence   | Lenght
1 | 4  | A0MZ66    | EQAIGEYEDLR | EQAIGEYEDLR  | 12
2 | 20 | O00273    | KTETVQEACER | KTETVQEACER  | 12
3 | 31 | O00273    | ASPPGDLQNPK | ASPPGDLQNPK  | 12
4 | 32 | O00273    | ALAVALNWDIK | ALAVALNWDIK  | 12
5 | 34 | O00273    | LQQTQSLHSLR | LQQTQSLHSLR  | 12
6 | 45 | O60271    | GGETPGSEQWK | GGETPGSEQWK  | 12
7 | 50 | O60271    | DVAGLDTEGSK | DVAGLD*TEGSK | 12

Note: there is a * sign to the right of the sixth amino acid (anchorAA) in each sequence in column "M_Sequence", as shown for the last sequence, DVAGLD*TEGSK, but when I posted my comment most of the * signs disappeared, except for the last one.
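
Because the marker was partly lost in posting, it can help to check each marked peptide before passing it on. The sketch below uses base R only to locate the * marker, read off the residue to its left, and flag peptides with no marker. The variable names are illustrative, and it makes no claim about how fetchSequence itself parses anchorPos.

```r
# Two peptides as they appear in the table above: one with the marker, one without.
m_seq <- c("DVAGLD*TEGSK", "GGETPGSEQWK")

# Position of the literal "*" in each peptide; -1 means no marker was found.
marker_pos <- as.integer(regexpr("*", m_seq, fixed = TRUE))
has_marker <- marker_pos > 0

# The anchor residue sits immediately to the left of the marker.
anchor_index <- ifelse(has_marker, marker_pos - 1L, NA_integer_)
anchor_residue <- substr(m_seq, anchor_index, anchor_index)

# Peptide with the marker stripped, for comparison with the Sequence column.
plain_seq <- gsub("*", "", m_seq, fixed = TRUE)

print(data.frame(m_seq, has_marker, anchor_index, anchor_residue, plain_seq))
```

Rows where has_marker is FALSE would need the marker restored in the input file before the anchor can be identified.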

By the way, here is my code for creating the proteome,

proteome <- prepareProteome(fasta = system.file("extdata", "HUMAN.fasta", package = "dagLogo"), species = "Homo sapiens")

Please tell me if you need anything more, thanks.

Zhaowei