spabinger closed this issue 7 years ago
Hi Stephan,
It looks like MetaMeta couldn't find the sequence files on the following paths:
custom_fungi_viral_db/kaiju/*.gbff
custom_fungi_viral_db/clark_dudes/*.fna
Any path or file in the .yaml file is always relative to the working directory set in the workdir variable (in this case /home/stephan/work/epityp/mirnaseq/05_metameta/output/). You can move the folder custom_fungi_viral_db to this path or use the complete path to indicate where the sequences are.
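A quick way to see what a relative pattern resolves to is sketched below. It assumes only the rule above (paths are read relative to workdir); `resolve_from_workdir` is a hypothetical helper for illustration and not part of MetaMeta.

```python
import glob
import os


def resolve_from_workdir(workdir, pattern):
    """Return the sorted files matched by `pattern`, read relative to `workdir`.

    An absolute pattern is used unchanged, because os.path.join discards
    its first argument when the second one is absolute.
    """
    return sorted(glob.glob(os.path.join(workdir, pattern)))


if __name__ == "__main__":
    # Paths copied from this thread; adjust them to your own setup.
    workdir = "/home/stephan/work/epityp/mirnaseq/05_metameta/output/"
    for pattern in ("custom_fungi_viral_db/kaiju/*.gbff",
                    "custom_fungi_viral_db/clark_dudes/*.fna"):
        matches = resolve_from_workdir(workdir, pattern)
        print(pattern, "->", len(matches), "file(s) found")
```

If a pattern reports 0 files, either move the folder under the working directory or put the complete path in the .yaml file.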
Best, Vitor
Hi,
I could now build the databases but during the first run I get this error:
Error in job kaiju_db_custom_2 while creating output files /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db/kaiju_db.bwt, /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db/kaiju_db.sa.
RuleException:
CalledProcessError in line 17 of /home/stephan/work/miniconda/data/envs/py35/opt/metameta/tools/kaiju_db_custom.sm:
Command 'mkbwt -n 5 -a ACDEFGHIKLMNPQRSTVWY -nThreads 12 -o /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db/kaiju_db /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db/kaiju_db.faa > /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/log/kaiju_db_custom_2.log 2>&1' returned non-zero exit status 5
File "/home/stephan/work/miniconda/data/envs/py35/opt/metameta/tools/kaiju_db_custom.sm", line 17, in __rule_kaiju_db_custom_2
File "/home/stephan/work/miniconda/data/envs/py35/lib/python3.5/concurrent/futures/thread.py", line 55, in run
Job failed, going on with independent jobs.
Unable to set utime on symlink sample_name_1/reads/pre1.1.fq.gz. Your Python build does not support it.
1 of 19 steps (5%) done
rule errorcorr_reads:
input: sample_name_1/reads/pre1.1.fq.gz
output: sample_name_1/reads/pre2.1.fq.gz
log: sample_name_1/log/errorcorr_reads.log
benchmark: sample_name_1/log/errorcorr_reads.time
wildcards: sample=sample_name_1
threads: 12
Thanks in advance, Stephan
Hi Stephan,
Please send me the following log files so I can help you figure out what the problem was:
/home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/log/kaiju_db_custom_1.log
/home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/log/kaiju_db_custom_2.log
Vitor
Hi,
kaiju_db_custom_2.log is empty
kaiju_db_custom_2.log
# infilename= /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db/kaiju_db.faa
# outfilename= /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db/kaiju_db
# Alphabet= ACDEFGHIKLMNPQRSTVWY
# nThreads= 12
# length= 0.000000
# checkpoint= 5
# caseSens=OFF
# revComp=OFF
# term= *
# revsort=OFF
# help=OFF
readFasta: No sequences read
kaiju_db_custom_4.log (older than kaiju_db_custom_2.log)
/home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/log/kaiju_db_custom_4.log 801/801 100%
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 38.5M 0 43440 0 0 22532 0 0:29:53 0:00:01 0:29:52 22531
19 38.5M 19 7643k 0 0 2639k 0 0:00:14 0:00:02 0:00:12 2639k
49 38.5M 49 19.0M 0 0 5072k 0 0:00:07 0:00:03 0:00:04 5072k
67 38.5M 67 26.1M 0 0 5546k 0 0:00:07 0:00:04 0:00:03 5545k
87 38.5M 87 33.7M 0 0 5912k 0 0:00:06 0:00:05 0:00:01 7039k
100 38.5M 100 38.5M 0 0 6044k 0 0:00:06 0:00:06 --:--:-- 8568k
nodes.dmp
Best, Stephan
Hi Stephan,
Looks like there's still a problem with the location of the files.
The kaiju database build first converts those files (*.gbff from the path you set in the yaml file for the kaiju custom database) into the file kaiju_db.faa (kaiju_db_custom_1), then builds the database files kaiju_db.bwt and kaiju_db.sa (kaiju_db_custom_2), and finally the index kaiju_db.fmi (kaiju_db_custom_3).
It looks like the first step is not working and kaiju_db.faa is empty. Do you have one or more .gbff (not gzipped) files in the directory custom_fungi_viral_db/kaiju/? Note that they are different from the files for the other tools (.fna).
If yes, can you check whether the file /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db/kaiju_db.faa is really empty?
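Both checks can be scripted. The sketch below is only an illustration: `check_kaiju_step1` and `count_fasta_records` are hypothetical helpers, not part of MetaMeta or kaiju, and the record count simply counts lines starting with ">".

```python
import glob
import os


def count_fasta_records(path):
    """Count FASTA header lines (lines starting with '>') in `path`."""
    with open(path, "r", errors="replace") as handle:
        return sum(1 for line in handle if line.startswith(">"))


def check_kaiju_step1(gbff_dir, faa_path):
    """Report the uncompressed .gbff inputs and the state of the .faa output."""
    report = {
        # "*.gbff" deliberately does not match gzipped files such as x.gbff.gz
        "gbff_files": sorted(glob.glob(os.path.join(gbff_dir, "*.gbff"))),
        "faa_exists": os.path.isfile(faa_path),
        "faa_records": 0,
    }
    if report["faa_exists"]:
        report["faa_records"] = count_fasta_records(faa_path)
    return report


if __name__ == "__main__":
    # Paths copied from this thread; adjust them to your own setup.
    base = os.path.expanduser("~/work/epityp/mirnaseq/05_metameta/databases")
    print(check_kaiju_step1(
        os.path.join(base, "custom_fungi_viral_db", "kaiju"),
        os.path.join(base, "new_custom_fungi_viral_db", "kaiju_db", "kaiju_db.faa"),
    ))
```

An existing kaiju_db.faa with 0 records would match the "readFasta: No sequences read" message in the mkbwt log.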
You can get an example of such files and custom database creation in the sample data provided with the tool.
Vitor
Hi,
custom_fungi_viral_db/kaiju
~/work/epityp/mirnaseq/05_metameta/databases/custom_fungi_viral_db/kaiju$ ls -l
total 7914969
-rw-rw-r-- 1 stephan stephan 13240120830 Sep 12 09:47 fungi_viral_genomes.gbff
new_custom_fungi_viral_db/kaiju_db/kaiju_db.faa
~/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db$ ls -l
total 23238
-rw-rw-r-- 1 stephan stephan 0 Sep 13 08:32 kaiju_db.faa
-rw-r--r-- 1 stephan stephan 111923416 Sep 13 08:32 nodes.dmp
Thanks, Stephan
Hi Stephan,
I've been trying to re-create your error but couldn't; it runs fine in my configuration. However, I noticed that the bioconda installation of kaiju 1.0 is missing some perl runtime libraries.
Details aside, you can try replacing your file /home/stephan/work/miniconda/data/envs/py35/opt/metameta/envs/kaiju.yaml with this one: kaiju.yaml. That may solve the problem.
Best, Vitor
Hi,
I tried running the pipeline with the new kaiju.yaml file, but I'm still getting an error:
Error in job kaiju_db_custom_2 while creating output files /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db/kaiju_db.bwt, /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db/kaiju_db.sa.
RuleException:
CalledProcessError in line 17 of /home/stephan/work/miniconda/data/envs/py35/opt/metameta/tools/kaiju_db_custom.sm:
Command 'mkbwt -n 5 -a ACDEFGHIKLMNPQRSTVWY -nThreads 12 -o /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db/kaiju_db /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db/kaiju_db.faa > /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/log/kaiju_db_custom_2.log 2>&1' returned non-zero exit status 5
File "/home/stephan/work/miniconda/data/envs/py35/opt/metameta/tools/kaiju_db_custom.sm", line 17, in __rule_kaiju_db_custom_2
File "/home/stephan/work/miniconda/data/envs/py35/lib/python3.5/concurrent/futures/thread.py", line 55, in run
Job failed, going on with independent jobs.
Unable to set utime on symlink sample_name_1/reads/pre1.1.fq.gz. Your Python build does not support it.
1 of 19 steps (5%) done
rule errorcorr_reads:
input: sample_name_1/reads/pre1.1.fq.gz
output: sample_name_1/reads/pre2.1.fq.gz
log: sample_name_1/log/errorcorr_reads.log
benchmark: sample_name_1/log/errorcorr_reads.time
wildcards: sample=sample_name_1
threads: 12
Unable to set utime on symlink sample_name_1/reads/pre2.1.fq.gz. Your Python build does not support it.
2 of 19 steps (11%) done
rule subsample_reads:
input: sample_name_1/reads/pre2.1.fq.gz
output: sample_name_1/reads/kraken.1.fq, sample_name_1/reads/motus.1.fq, sample_name_1/reads/gottcha.1.fq, sample_name_1/reads/kaiju.1.fq, sample_name_1/reads/clark.1.fq
log: sample_name_1/log/subsample_reads.log
benchmark: sample_name_1/log/subsample_reads.time
wildcards: sample=sample_name_1
Unable to set utime on symlink sample_name_1/reads/kraken.1.fq. Your Python build does not support it.
Unable to set utime on symlink sample_name_1/reads/motus.1.fq. Your Python build does not support it.
Unable to set utime on symlink sample_name_1/reads/gottcha.1.fq. Your Python build does not support it.
Unable to set utime on symlink sample_name_1/reads/kaiju.1.fq. Your Python build does not support it.
Unable to set utime on symlink sample_name_1/reads/clark.1.fq. Your Python build does not support it.
3 of 19 steps (16%) done
Do you have any other ideas?
Thanks, Stephan
Hi Stephan,
Did you delete the file /home/stephan/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db/kaiju_db.faa before running the pipeline with the updated kaiju.yaml file?
The error is happening in a very simple step, where kaiju converts the .gbff to .faa and updates the headers for further steps (with the script gbk2faa.pl). You can download this script to check whether it works standalone, which would imply some other error in the pipeline:
./gbk2faa.pl ~/work/epityp/mirnaseq/05_metameta/databases/custom_fungi_viral_db/kaiju/fungi_viral_genomes.gbff ~/work/epityp/mirnaseq/05_metameta/databases/new_custom_fungi_viral_db/kaiju_db/kaiju_db.faa
This command is exactly what MetaMeta is failing to execute.
Besides that, your .gbff file could be corrupted, but if you can open it and see its content, that's not the case.
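On the first question: a zero-byte kaiju_db.faa left over from an earlier failed run may be treated as already built, so the conversion step is not repeated. A minimal sketch for clearing such a file before re-running, where `remove_if_empty` is a hypothetical helper and not part of MetaMeta:

```python
import os


def remove_if_empty(path):
    """Delete `path` only if it is a regular zero-byte file.

    Returns True when a file was removed, False otherwise, so a
    non-empty (possibly valid) file is never touched.
    """
    if os.path.isfile(path) and os.path.getsize(path) == 0:
        os.remove(path)
        return True
    return False


if __name__ == "__main__":
    # Path copied from this thread; adjust it to your own setup.
    faa = ("/home/stephan/work/epityp/mirnaseq/05_metameta/databases/"
           "new_custom_fungi_viral_db/kaiju_db/kaiju_db.faa")
    print("removed stale file" if remove_if_empty(faa) else "nothing removed")
```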
Best, Vitor
Hi,
thanks for the comment - it is currently building the .faa file.
Another question:
Thanks, Stephan
Hi Stephan,
Is it working because you deleted the kaiju_db.faa file, or are you running the command line provided?
Yes, you can use them together by adding it to the configuration file. For example:
databases:
- "new_custom_fungi_viral_db"
- "archaea_bacteria"
That will run your samples against both databases (the pre-configured and the custom one), generating separate results for each of them. There is no way to run them together as one, since the tools do not provide such functionality.
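To make "separate results for each" concrete, here is a small sketch that lists the (sample, database) combinations such a configuration implies. The dictionary only mirrors the keys used in this thread; `runs_for` is a hypothetical helper, and how MetaMeta names or stores each result is not shown.

```python
from itertools import product


def runs_for(config):
    """Return every (sample, database) pair implied by `config`."""
    return list(product(sorted(config["samples"]), config["databases"]))


if __name__ == "__main__":
    # Illustrative values only, taken from names used in this thread.
    config = {
        "samples": {"sample_name_1": {"fq1": "not_mapped.fastq.gz"}},
        "databases": ["new_custom_fungi_viral_db", "archaea_bacteria"],
    }
    for sample, database in runs_for(config):
        print(sample, "x", database)
```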
Best, Vitor
Hi Vitor,
it's running because I deleted the kaiju_db.faa file.
Thanks for the information about running with multiple databases.
Best, Stephan
Hi Stephan,
Nice, so the solution is the updated version of kaiju, because of the perl dependency missing from the installation. I will fix that in the next release.
Best, Vitor
Hi,
I tried creating a new custom database, but when running the pipeline I get the following error:
# Database output directory (Tip: create this folder in a common directory so it could be used for other runs as well as other users)
dbdir: "/home/stephan/work/epityp/mirnaseq/05_metameta/databases/"
# Sample (name and files)
samples:
  "sample_name_1":
    fq1: "/home/stephan/work/epityp/mirnaseq/05_metameta/input/not_mapped.fastq.gz"
# Add more samples here
# Custom database
databases:
  "new_custom_fungi_viral_db":
    clark: "custom_fungi_viral_db/clark_dudes/"
    dudes: "custom_fungi_viral_db/clark_dudes/"
    kaiju: "custom_fungi_viral_db/kaiju/"
    kraken: "custom_fungi_viral_db/kraken/"
################################################################
# Configured tools (p=profiling, b=binning) from tools folder (tool.sm and tool_db.sm)
tools:
  "clark": "b"
  "dudes": "p"
# MetaMeta Pipeline
# Number of threads for each tool (distributed among the number of cores defined by main parameter --cores)
threads: 12
# Gzipped input files (0: not gzipped / 1: gzipped). Default: 0
gzipped: 1
# Keep intermediate files (database, reads and output) (0: do not keep files / 1: keep all files). Default: 0
keepfiles: 1 ### TODO change to 0