hepcat72 / CFF

Cluster-free Filtering. Determine which sequences are real in a metagenomic sample.
GNU General Public License v3.0

getReals.pl Error: Unable to open input file: [/tmp/global_library.fna.1471352954.reals.tmp.fna]. #7

Closed TanishaSH closed 8 years ago

TanishaSH commented 8 years ago

Hello, could you explain this error to me? As I understand it, this input file is created as a temporary file while the script runs. What can I do to prevent the error?

getReals.pl -i 'all_samles//3cands/.drp.fna.lib.n0s.cands' -n 'all_samles//2n0s/.drp.fna.lib.n0s' -f 'all_samles//1_lib/global_library.fna' -k 2 --outdir 'all_samles//4_reals_table' -t fasta

hepcat72 commented 8 years ago

Hi Tanisha,

Thanks for the report. First, let me provide some context. The main purpose of the getReals.pl script is to pick candidate sequences from the various sample files which were "nominated" by the getCandidates.pl script. Candidates are picked by the fact that the same sequence is present in at least -k candidate files (candidate files are provided using -i). The secondary purpose of the script is to filter chimeric sequences.
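To make the candidacy rule concrete, here is a minimal shell sketch. It is not the getReals.pl implementation: the function name count_candidacies is hypothetical, and it assumes each FASTA record keeps its sequence on a single line. It prints every sequence found in at least k of the files it is given.

```shell
# Hypothetical helper, not part of CFF.
# Usage: count_candidacies K file1 [file2 ...]
# Assumes single-line FASTA records (one defline, one sequence line).
count_candidacies() {
    k=$1
    shift
    awk -v k="$k" '
        /^>/    { next }                       # skip FASTA deflines
        NF == 0 { next }                       # skip blank lines
        !seen[FILENAME, $0]++ { files[$0]++ }  # count each file once per sequence
        END { for (s in files) if (files[s] >= k) print s }
    ' "$@" | sort
}
```

With k set to 2 and only one file passed in, the function prints nothing, since no sequence can appear in two files.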

In this instance, your command has set -k to 2. That's the default in the pipeline scripts. This means that there must be at least 2 candidate files provided (using -i) to the script in order for it to provide any output. Your command provides only 1 candidate file (all_samles//3_cands/.drp.fna.lib.n0s.cands). The script is supposed to catch these instances, halt execution, and issue the following error:

ERROR1: Too few candidates files (-i) supplied. The number of minimum candidacies (-k): [2] requires at least as many files supplied to each of the -i and -n (backwards-compatible with -d) options. I.e. -i requires at least [2] files and -n requires at least [2] files. If you only have 1 of each file, then this script should not be applied unless you set -k to 1, which will allow you to filter for chimeras at least.

I don't know how you got the open error from getReals.pl because when I run your command with simulated files, all I get is the above error and nothing else.

Do you have 1 sample file of sequences that you are processing or do you have multiple samples all in one file?

The only thing I can figure is that the error-catching code could have been commented out? If that's the case, the open error you're getting relates to a file created by the second of 2 usearch (i.e. "uchime") commands which are used for filtering chimeric sequences. The error could relate to your ability to write to the /tmp directory, or something happening in the uchime executable itself. There are ways to debug that, but I suggest you first address the issue with the -k and -i requirements.
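One way to check the -k and -i requirement before running the script is to count how many candidate files the shell actually passes along. The sketch below is only an illustration: enough_files is a hypothetical helper, not something CFF provides.

```shell
# Hypothetical pre-flight check, not part of CFF.
# Usage: enough_files K file1 [file2 ...]
# Succeeds only when at least K of the arguments are existing regular files.
enough_files() {
    k=$1
    shift
    n=0
    for f in "$@"; do
        # An unmatched glob is passed through literally and is not a file,
        # so it does not count.
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    [ "$n" -ge "$k" ]
}
```

For example, with a made-up path, `enough_files 2 samples/*.cands || echo "fewer than 2 candidate files"` would report the problem when the pattern expands to a single file.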

Rob

TanishaSH commented 8 years ago

Hi, Rob,

Thank you for your help. Yes, I made a mistake in the text. The correct command is this:

getReals.pl -i 'all_samles//3_cands/.drp.fna.lib.n0s.cands' -n 'all_samles//2_n0s/.drp.fna.lib.n0s' -f 'all_samles//1_lib/global_library.fna' -k 2 --outdir 'all_samles//4_reals_table' -t fasta

I ran this to try to catch the error. Before that, I ran the standard pipeline:

bash run_CFF_on_FastQ.tcsh 250 0.14 33 all_samles/ "*.fastq"

hepcat72 commented 8 years ago

Hmmm... Perhaps there's a problem with an email app stripping out characters? It still appears as 1 input candidates file. But there must be more if you're getting the open error. So in that case, here's what I think could be going on...

The temporary file did not get created. There are a couple reasons this could happen. Either (1) there was a problem with access to the temporary directory (/tmp) or (2) there was a problem with your usearch (i.e. uchime) installation.

  1. getReals.pl defaults to the temporary directory in your TMPDIR environment variable. It tries a couple other variations of that variable if it doesn't exist or if the directory doesn't exist. If no variable is found, it sets '/tmp' if it is an existing directory. However, looking at the code, I see there's no check on whether the user has write access to the tmp directory, so it's possible you might not have write permission to that directory.
  2. The temporary file is created by the usearch ("uchime") executable, namely from a command like this:

$uchime_exe -uchime_denovo $cands_with_global_abund -minuniquesize 2 -nonchimeras $tmp_out_file $aln_arg -quiet

uchime is a noisy executable, and getReals has its own structured verbose output, so the -quiet flag is used, but that could have suppressed an error coming from uchime regarding the execution on your files, resulting in no temporary output file created (and thus the open error you received).

I suggest trying 2 things:

First, try creating a file in /tmp by executing this command:

touch /tmp/test

If you get an error about permission, then that's the issue. To fix this, you can either make /tmp writable or set the TMPDIR environment variable in your shell to a directory that you do have write permission to.
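The same check can be scripted. The following is a sketch under assumptions, not the code in getReals.pl: pick_tmpdir is a hypothetical name, and TMP and TEMP are guesses standing in for the "other variations" of TMPDIR. Unlike the behavior described above, it also tests for write permission.

```shell
# Hypothetical sketch, not the getReals.pl logic.
# Prints the first candidate that is an existing, writable directory.
# TMP and TEMP are assumed fallback variable names.
pick_tmpdir() {
    for d in "${TMPDIR:-}" "${TMP:-}" "${TEMP:-}" /tmp; do
        if [ -n "$d" ] && [ -d "$d" ] && [ -w "$d" ]; then
            printf '%s\n' "$d"
            return 0
        fi
    done
    echo "no writable temporary directory found" >&2
    return 1
}
```

If it prints a directory other than the one you expected, or fails outright, exporting TMPDIR to a directory you own is the simplest workaround.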

Second, if you have write permission to /tmp, my bet would be an issue coming from uchime. Your usearch installation is likely good because the script checks the installation when it runs and would have issued a more specific error if there was a problem with it. To see what uchime has to say about your run, we need to do 2 things:

• Add the --overwrite and --verbose flags to the getReals call that you pasted into your email:

getReals.pl -i 'all_samles//3_cands/.drp.fna.lib.n0s.cands' -n 'all_samles//2_n0s/.drp.fna.lib.n0s' -f 'all_samles//1_lib/global_library.fna' -k 2 --outdir 'all_samles//4_reals_table' -t fasta --overwrite --verbose

When that runs, it will, among other things, output the exact usearch commands that it runs. There are 2 of them. The file in question is created in the second call, but it could be a problem with either command. Since their verbose output is quieted when the script runs, to get feedback on what they may be complaining about, you'll have to:

• Execute the usearch commands from the verbose output (without the -quiet flag).

Then you can address whatever issues you find there.
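If editing the commands by hand is tedious, a small filter can drop the flag for you. This is only a convenience sketch: strip_quiet is a hypothetical helper, and it assumes the flag appears exactly as -quiet, separated from its neighbors by whitespace.

```shell
# Hypothetical helper: remove the -quiet flag from a command line copied from
# the verbose output, so that usearch prints its own messages when re-run.
strip_quiet() {
    printf '%s\n' "$1" | sed -E 's/[[:space:]]+-quiet([[:space:]]|$)/\1/g'
}
```

The function only prints the edited command; review it before pasting it back into the shell.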

Let me know what you find.

Rob

hepcat72 commented 8 years ago

I have issued a couple package updates in order to better catch and describe (or possibly even fix) the possible errors you could be encountering. You can update your installation by cd'ing into the CFF (or CFF-master) directory and issuing these commands:

perl Makefile.PL
make
sudo make install

Rob

TanishaSH commented 8 years ago

Rob, thank you very much! I followed your tips and checked access to the /tmp directory, and this error isn't caused by access permissions. So I reinstalled all the scripts, and now getReals.pl works correctly. I know I should still check whether the error appears again. I also have similar errors in other steps, such as Indelfilter. I'll try to fix them myself, and if I can't, I'll write to you again. Thanks for the quick answers!

hepcat72 commented 8 years ago

If /tmp was writable, nothing I did would have fixed the getReals.pl script, because the only fix I made was to check whether the temp directory is writable... unless you had a much older version of CFF and some update in there fixed this issue.

Regardless, glad it's getting further. I'll close this issue.