larssnip / micropan

R package for microbial pangenomics

How to deal with this error? #3

Closed mylanhong closed 4 years ago

mylanhong commented 6 years ago

Hello, when I ran the R code below, an error appeared indicating "cannot open the connection":

Calling genes by Prodigal

    for( i in 1:dim(genome.table)[1] ){
      cat("Predicting genes in", genome.table$File[i], "...\n")
      genome.file <- file.path("data/genomes", genome.table$File[i])
      prot.file <- file.path("data/proteins", genome.table$File[i])
      gff.table <- prodigal(genome.file, prot.file)
    }

    Error in file(in.file, open = "rt") : cannot open the connection
    In addition: Warning messages:
    1: running command 'prodigal -i data/genomes/Mpneumoniae M129.fsa -f gff -o prodigal.gff -q -a data/proteins/Mpneumoniae M129.fsa -c ' had status 15
    2: In file(in.file, open = "rt") :
      cannot open file 'prodigal.gff': No such file or directory

How do I deal with this error?

khliland commented 6 years ago

Hi mylanhong. This code seems to be from the casestudy.pdf. Did all the code leading up to this particular section work without any warnings? Regards, Kristian

mylanhong commented 6 years ago

@khliland Thank you for your reply! Yes, the code is from casestudy.pdf. I am new to R and just wanted to step through the case study to get to know the micropan package better. All the code leading up to this section ran without any warnings. Could the problem be that the prodigal() function cannot invoke the Prodigal software? However, when I ran system("prodigal -h"), a listing of the available Prodigal options appeared. That means Prodigal is properly installed, right?

larssnip commented 6 years ago

The error from R is due to not being able to read the file prodigal.gff. This in turn is because the call to prodigal failed and never created prodigal.gff. The problem lies in the call to prodigal.

  1. Does the subfolder data/proteins/ exist? If not, create it and retry.
  2. Which version of prodigal is used? Run system("prodigal -v") in the R console. It should be version 2.6.3 or newer.
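Both checks can also be scripted before the loop is run. The following is only a minimal sketch: the helper name check_prodigal_setup and its arguments are hypothetical and not part of micropan, and it assumes the case study's folder layout.

```r
# Hypothetical helper (not part of micropan): checks the two preconditions above.
check_prodigal_setup <- function(out.dir = "data/proteins", exe = "prodigal") {
  # Create the output folder if it is missing; recursive = TRUE handles nested paths.
  if (!dir.exists(out.dir)) {
    dir.create(out.dir, recursive = TRUE)
  }
  # Sys.which() returns an empty string when the executable is not on the PATH.
  exe.path <- unname(Sys.which(exe))
  list(dir.ok    = dir.exists(out.dir),
       exe.found = nzchar(exe.path),
       exe.path  = exe.path)
}
```

If status <- check_prodigal_setup() reports status$exe.found as TRUE, system("prodigal -v") can then be used to read off the version.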

larssnip commented 6 years ago

I took a closer look and the problem might be the filenames used.

The filename Mpneumoniae M129.fsa has a space in it. Prodigal will probably not tolerate this: it assumes the filename is Mpneumoniae and that M129.fsa is some (unknown) option. Replace all spaces with underscores: Mpneumoniae_M129.fsa. In fact, never use spaces inside filenames, ever...
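The renaming can be done from R for a whole folder at once. This is a rough sketch only: replace_spaces is a hypothetical helper, not a micropan function, and it assumes the genome files all sit directly in one folder such as data/genomes.

```r
# Hypothetical helper (not part of micropan): replace spaces in all file names
# found directly inside `folder` with underscores.
replace_spaces <- function(folder) {
  old.names <- dir(folder)
  new.names <- gsub(" ", "_", old.names, fixed = TRUE)
  changed <- old.names != new.names
  if (any(changed)) {
    # file.rename() is vectorised and returns TRUE for each successful rename.
    ok <- file.rename(file.path(folder, old.names[changed]),
                      file.path(folder, new.names[changed]))
    if (!all(ok)) warning("Some files could not be renamed")
  }
  invisible(new.names)
}
```

After calling replace_spaces("data/genomes"), the File column of genome.table needs the same substitution so that it still matches the files on disk, for example genome.table$File <- gsub(" ", "_", genome.table$File, fixed = TRUE).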

mylanhong commented 6 years ago

@larssnip Yes, it was the space inside the filename that caused the error. Thanks a lot for your help. O(∩_∩)O

mylanhong commented 6 years ago

@larssnip Hello larssnip, when I run the code on page 8 of casestudy.pdf:

    in.files <- file.path("data/prepped", dir("data/prepped"))
    db <- "H:/Mpneumoniae/pHMM_database/Pfam-A.hmm/Pfam-A.hmm" # edit this to match your system
    out.folder <- "pfam"
    hmmerScan(in.files, db, out.folder)

The following warning and error appeared:

    hmmerScan: Scanning data/prepped/Mpneumoniae_19294_GID7.fsa ...
    cygwin warning:
      MS-DOS style path detected: H:/Mpneumoniae/pHMM_database/Pfam-A.hmm/Pfam-A.hmm
      Preferred POSIX equivalent is: /Mpneumoniae/pHMM_database/Pfam-A.hmm/Pfam-A.hmm
      CYGWIN environment variable option "nodosfilewarning" turns off this warning.
      Consult the user's guide for more details about POSIX paths:
        http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
    Error: Unrecognized format, trying to open hmm file H:/Mpneumoniae/pHMM_database/Pfam-A.hmm/Pfam-A.hmm for reading.

I am wondering whether the format of the database path is wrong. It seems the path is not being recognized.

larssnip commented 6 years ago

Is the path correct for your system? It looks a little strange that Pfam-A.hmm is both a folder and a file inside that folder. Should the path be only H:/Mpneumoniae/pHMM_database/Pfam-A.hmm?
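One way to settle this from the R console is to ask what the path actually points to. A small sketch, where path_kind is a hypothetical helper name and only base R functions are used:

```r
# Hypothetical helper: report whether a path is a directory, a regular file, or absent.
path_kind <- function(path) {
  if (dir.exists(path)) {
    "directory"
  } else if (file.exists(path)) {
    "file"
  } else {
    "missing"
  }
}
```

For example, path_kind("H:/Mpneumoniae/pHMM_database/Pfam-A.hmm") shows whether that name is the folder or the database file itself; the db argument should be a path for which the result is "file".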

mylanhong commented 6 years ago

@larssnip I confirmed that the path was correct. The folder and the file had the same name. To avoid any confusion, I placed the Pfam-A.hmm file directly in the Mpneumoniae folder, but the error appeared again:

    in.files <- file.path("data/prepped", dir("data/prepped"))
    db <- "H:/Mpneumoniae/Pfam-A.hmm" # edit this to match your system
    out.folder <- "pfam"
    hmmerScan(in.files, db, out.folder)

    hmmerScan: Scanning data/prepped/Mpneumoniae_19294_GID7.fsa ...
    cygwin warning:
      MS-DOS style path detected: H:/Mpneumoniae/Pfam-A.hmm
      Preferred POSIX equivalent is: /Mpneumoniae/Pfam-A.hmm
      CYGWIN environment variable option "nodosfilewarning" turns off this warning.
      Consult the user's guide for more details about POSIX paths:
        http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
    Error: Unrecognized format, trying to open hmm file H:/Mpneumoniae/Pfam-A.hmm for reading.

mylanhong commented 6 years ago

@larssnip Besides, after uncompressing the file Pfam-A.hmm.gz, I ran hmmpress on the Pfam-A.hmm file, and an error appeared:

    C:\Users\dn> set path="H:\Mpneumoniae"
    C:\Users\dn> hmmerpress -f Pfam-A.hmm
    Error: Failed to open HMM file Pfam-A.hmm for reading.

This was done with cmd.exe on Windows.

mylanhong commented 6 years ago

@larssnip I did some searching on Google, and some people suggest the cause may be a mismatch between the HMMER version and the Pfam-A.hmm version. My HMMER version is HMMER 3.0 <MARCH 2010>, while the Pfam-A database was downloaded from ftp://ftp.ebi.ac.uk/pub/databases/Pfam/current_release.

The Pfam-A database releases range from 1.0 to 31.0 (except 2.0 and 3.0), but which release matches HMMER 3.0?

larssnip commented 6 years ago

Hmm, yes, that sounds like the problem. As a quick fix you can use an older version of Pfam, I guess something like version 27.0 or older? This is a HMMER/Pfam issue.
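To see which HMMER release wrote a given Pfam-A.hmm, the first line of the file can be inspected, since the HMMER3 text profile format starts with a header line naming the format tag and the writing version. The sketch below is an assumption-laden illustration: hmm_header is a hypothetical helper, and the exact tags (to my knowledge "HMMER3/b" for files written by 3.0 and "HMMER3/f" for 3.1) should be checked against the HMMER user guide.

```r
# Hypothetical helper: return the first line of an HMM profile file.
# In the HMMER3 text format this line carries the format tag and the
# HMMER version that wrote the file (assumption: uncompressed text file).
hmm_header <- function(hmm.file) {
  readLines(hmm.file, n = 1)
}
```

Calling hmm_header("H:/Mpneumoniae/Pfam-A.hmm") and comparing the result with the version banner of the installed HMMER tools gives a quick indication of whether the two match.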