michaelgruenstaeudl / PACVr

Plastome Assembly Coverage Visualization in R

Error with the genbank file #3

Closed Silverfoxcome closed 4 years ago

Silverfoxcome commented 4 years ago

Hi!!! Now the script gives me this error message:

Error in genbankr::readGenBank(gbkFile) : file does not appear to an existing file or a GBAccession object. Valid file or text argument is required.

But I'm giving it the example genbank file with only the name and the sequence length changed.

Edit: I'm giving the function the same example GenBank file, but it still doesn't recognize it:

gbkFile <- system.file("extdata", "INIA601/MH161174.gb", package="PACVr")

it gives me the same error message :O

What can be wrong?

Thank you in advance

nilsj9 commented 4 years ago

Hi @Silverfoxcome, it seems that your path

"INIA601/MH161174.gb"

is incorrect. Use gbkFile <- system.file("extdata", "MH161174/MH161174.gb", package="PACVr") instead.
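For context, `system.file()` only searches inside the directory of an installed package, and by default it returns an empty string rather than raising an error when nothing matches. A minimal base-R sketch of that behaviour follows; it uses the always-installed `base` package and a made-up file name, so it runs without PACVr:

```r
# system.file() resolves paths relative to an installed package's directory.
# A file that is not there yields "" (no error), unless mustWork = TRUE.
missing_path <- system.file("extdata", "no_such_dir/no_such_file.gb",
                            package = "base")
print(missing_path)          # empty string
print(nzchar(missing_path))  # FALSE: nothing was found

# With mustWork = TRUE the problem surfaces immediately as an error.
result <- tryCatch(
  system.file("extdata", "no_such_dir/no_such_file.gb",
              package = "base", mustWork = TRUE),
  error = function(e) "error: no file found"
)
print(result)
```

An empty string passed on to `genbankr::readGenBank()` is the likely source of the "does not appear to an existing file" message reported above.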

Silverfoxcome commented 4 years ago

Hi! Yeah, at first, I did:

gbkFile <- system.file("INIA601", "INIA601/INIA601.gb", package="PACVr")

But it gave the error message:

Error in genbankr::readGenBank(gbkFile) : file does not appear to an existing file or a GBAccession object. Valid file or text argument is required.

I didn't know what was wrong with my genbank file so, instead, I moved the example genbank file (MH161174.gb) to my INIA601 folder to see what happened:

gbkFile <- system.file("INIA601", "INIA601/MH161174.gb", package="PACVr")

but it also gave me this error message:

Error in genbankr::readGenBank(gbkFile) : file does not appear to an existing file or a GBAccession object. Valid file or text argument is required.

But that was the example GenBank file. It works in its own folder but not here :O

Thank you so much for answering my questions!

nilsj9 commented 4 years ago

Try specifying the absolute path to your gbk file directly, without using "system.file()". For example: gbkFile <- "path/to/your/file/INIA601/INIA601.gb"
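One way to catch a wrong path before it reaches PACVr is to verify it first. The sketch below is only an illustration: `resolve_input` is a hypothetical helper name, and the demo uses a temporary stand-in file in place of a real GenBank record.

```r
# Hypothetical helper: turn a user-supplied path into a verified absolute path.
resolve_input <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path, " (working directory: ", getwd(), ")")
  }
  normalizePath(path, mustWork = TRUE)
}

# Demo with a temporary stand-in file; replace it with your own .gb path.
demo_file <- tempfile(fileext = ".gb")
writeLines("LOCUS       DEMO", demo_file)
gbkFile <- resolve_input(demo_file)
print(file.exists(gbkFile))
```

The same check can be applied to the BAM file path, so that a missing input is reported with its full path instead of a parser error.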

Silverfoxcome commented 4 years ago

Thank you so much for your answer!!

I gave the absolute paths:

library(PACVr)

gbkFile <- "/media/koalaz/malazan_store/tesis/PACVr/extdata/INIA601/INIA601.gb"

bamFile <- "/media/koalaz/malazan_store/tesis/PACVr/extdata/INIA601/INIA601.sorted.bam"

outFile <- paste(tempdir(), "/INIA601_AssemblyCoverage_viz.pdf", sep="")

PACVr.complete(gbk.file=gbkFile, bam.file=bamFile, windowSize=250, mosdepthCmd='/media/koalaz/malazan_store/bio_tools/bin/mosdepth', threshold=15, delete=TRUE, output=outFile)

And I think it almost worked!!!

Now it gives me this error message:

Error in base::strsplit(x, ...) : non-character argument

But at least my gb file was read ToT

Silverfoxcome commented 4 years ago

The problem seems to lie with my genbank file because I gave the example gb file....

library(PACVr)

gbkFile <- "/media/koalaz/malazan_store/tesis/PACVr/extdata/INIA601/MH161174.gb"

bamFile <- "/media/koalaz/malazan_store/tesis/PACVr/extdata/INIA601/INIA601.sorted.bam"

outFile <- paste(tempdir(), "/INIA601_AssemblyCoverage_viz.pdf", sep="")

PACVr.complete(gbk.file=gbkFile, bam.file=bamFile, windowSize=250, mosdepthCmd='/media/koalaz/malazan_store/bio_tools/bin/mosdepth', threshold=15, delete=TRUE, output=outFile)

and it worked!

nilsj9 commented 4 years ago

Your GenBank flat file must strictly follow the GenBank flat file format; otherwise, errors may occur. In addition, the parser of the 'genbankr' package currently in use seems to have problems with the feature key 'exon'. To work around this error, change 'exon' features to 'CDS' or 'gene' features. Sorry about that. We are currently working on a better solution. An FAQ will be added in the next few days.
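The workaround can be applied with a text substitution on the feature table. This is a rough sketch rather than an official PACVr tool: `rename_exon_features` is a hypothetical helper, and it assumes the usual flat-file layout in which feature keys follow five leading spaces and occupy a 16-character field.

```r
# Hypothetical helper: rewrite 'exon' feature keys in a GenBank flat file.
# Assumes feature keys start after five spaces and fill a 16-character field.
rename_exon_features <- function(in_file, out_file, new_key = "gene") {
  lines <- readLines(in_file)
  is_exon <- grepl("^ {5}exon +", lines)
  # Pad the new key to 16 characters so locations stay in the same column.
  lines[is_exon] <- sub("^ {5}exon +", sprintf("     %-16s", new_key),
                        lines[is_exon])
  writeLines(lines, out_file)
  invisible(sum(is_exon))  # number of feature lines changed
}

# Demo on a made-up three-line feature table.
demo_in  <- tempfile(fileext = ".gb")
demo_out <- tempfile(fileext = ".gb")
writeLines(c(
  "FEATURES             Location/Qualifiers",
  "     exon            1..100",
  "                     /gene=\"rps12\""
), demo_in)
n_changed <- rename_exon_features(demo_in, demo_out)
print(readLines(demo_out)[2])
```

Writing to a separate output file keeps the original annotation intact, which makes it easy to revert once the parser issue is fixed upstream.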

Silverfoxcome commented 4 years ago

It worked!!!!!!!!!!! INIA601_AssemblyCoverage_viz.pdf

I edited my GenBank file, using the example GenBank file as a template and taking into account what you said here and also what you wrote in the article about PACVr:

and all sequence features of class ‘exon’ to be removed. To be suitable for PACVr, the sequence record of the GenBank file must represent a complete, quadripartite plastid genome, with a total sequence length between 100 kb and 200 kb and features annotations for each of the two IRs (with note-qualifiers that have the text values ‘IRa’ and ‘IRb’, respectively).
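The requirements quoted above can be turned into a quick text-level check before running PACVr. This is a sketch under assumptions: `check_plastome_gbk` is a hypothetical helper, it reads the length from the LOCUS line, and it only looks for 'IRa' and 'IRb' on lines that contain a `/note=` qualifier.

```r
# Hypothetical pre-flight check based on the requirements quoted above.
check_plastome_gbk <- function(gbk_file) {
  lines <- readLines(gbk_file)
  # Sequence length as declared on the LOCUS line, e.g. "... 155000 bp ..."
  locus <- grep("^LOCUS", lines, value = TRUE)[1]
  seq_len_bp <- suppressWarnings(
    as.numeric(sub("^.*?([0-9]+) bp.*$", "\\1", locus, perl = TRUE))
  )
  notes <- grep("/note=", lines, value = TRUE)
  c(
    length_ok = !is.na(seq_len_bp) && seq_len_bp >= 1e5 && seq_len_bp <= 2e5,
    has_IRa   = any(grepl("IRa", notes, fixed = TRUE)),
    has_IRb   = any(grepl("IRb", notes, fixed = TRUE)),
    no_exon   = !any(grepl("^ {5}exon +", lines))
  )
}

# Demo on a made-up, heavily shortened record (not a real accession).
demo <- tempfile(fileext = ".gb")
writeLines(c(
  "LOCUS       DEMO01              155000 bp    DNA     circular PLN 01-JAN-2020",
  "FEATURES             Location/Qualifiers",
  "     repeat_region   85000..110000",
  "                     /note=\"IRa\"",
  "     repeat_region   129000..155000",
  "                     /note=\"IRb\""
), demo)
checks <- check_plastome_gbk(demo)
print(checks)
```

Passing these checks does not guarantee that the file parses; it only rules out the specific problems named in the quoted passage.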

Nothing to be sorry about at all!!!! On the contrary!!!! Thank you so much for all your guidance!!! This is for my final undergraduate work, so it was very important to me :D !!!!

Thank you so much for this great Plastome Visualization tool!!!!

DelaFuenteJavier commented 3 years ago

Hi all. I'm pretty new to R and to this community, so sorry in advance if I'm breaking any rules.

I found a similar problem. My intention is to import a GBK file in order to create a TxDb object. So what I do is the following:

smpfile = system.file("mypath/myfile.gbk", package="genbankr",mustWork = TRUE)
gb = readGenBank(smpfile)
tx = makeTxDbFromGenBank(gb)

But when I try to create the smpfile I get the following message: "Error in system.file("mypath/myfile.gbk", : no file found "

I first thought the problem was in my GBK file, but I downloaded other gbk files and it still does not work. I do not have any "exon" features, and the total length of the gbk file is 65 kb. Do you have any clue about what's going on?

Thx in advance!

nilsj9 commented 3 years ago

Hi @deLFH. In order to import a GBK file, you have to specify the full path to your file.

gbkFile <- "path/to/my/file.gbk"
gb <- genbankr::readGenBank(gbkFile)
tx <- genbankr::makeTxDbFromGenBank(gb)

DelaFuenteJavier commented 3 years ago

Hi @nilsj9! Thanks for the quick answer! After following your suggestions I still got the following error:

Error in genbankr::readGenBank(gbkFile) : file does not appear to an existing file or a GBAccession object. Valid file or text argument is required.

I attach the gbk file I am using (which I now think may be the problem): myfile.txt. Note that I use it as a .gbk file, but I changed the extension to .txt just to be able to upload it here.

Thanks again!

michaelgruenstaeudl commented 3 years ago

Hi @deLFH, the file you posted is not a valid GenBank flatfile. It has neither a source feature nor any gene qualifiers, among other issues. The parsing module genbankr consequently does not identify your file as a "valid file". Please see examples of valid GenBank flatfiles under PACVr/inst/extdata/NC045072/. Best, Michael
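Two of the problems named above can be spotted with a plain text scan before the file is handed to genbankr. The sketch below is illustrative only: `scan_gbk_basics` is a hypothetical helper, and since the reply mentions further issues, a file that passes may still fail to parse.

```r
# Hypothetical quick scan: LOCUS line, 'source' feature, and /gene qualifiers.
scan_gbk_basics <- function(gbk_file) {
  lines <- readLines(gbk_file)
  c(
    has_locus          = any(grepl("^LOCUS", lines)),
    has_source         = any(grepl("^ {5}source +", lines)),
    has_gene_qualifier = any(grepl("^ +/gene=", lines))
  )
}

# Demo on a made-up fragment that lacks both a source feature and /gene.
demo <- tempfile(fileext = ".gbk")
writeLines(c(
  "LOCUS       DEMO02               65000 bp    DNA     linear   UNK 01-JAN-2021",
  "FEATURES             Location/Qualifiers",
  "     CDS             1..300",
  "                     /product=\"hypothetical protein\""
), demo)
basics <- scan_gbk_basics(demo)
print(basics)
```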

michaelgruenstaeudl commented 3 years ago

On a different note, please open a new issue (with a fresh title) when you have a question and do not re-open an already closed issue. That makes life easier for us in helping users. Thanks.

DelaFuenteJavier commented 3 years ago

You are right, and I am sorry. I have already opened a new issue. Thanks again for your patience.