Closed sauloal closed 9 years ago
Run ulimit -n and I'm guessing you'll get 1024, which is the maximum number of open file descriptors for your process. Since it seems like you have 1020 intermediate jf files, I think you've run out of file descriptors. You can try setting a higher limit with ulimit -n [new limit] and Jellyfish should be able to keep more files open at once.
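For anyone launching Jellyfish from a script rather than an interactive shell, the same check can be sketched in Python with the standard resource module (Unix only). This is a rough illustration, not something Jellyfish provides: the helper names are made up, and the figure of 4 reserved descriptors (stdin, stdout, stderr and one output file) is an assumption.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file-descriptor limits of this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def fits_within_limit(n_files, soft_limit, reserved=4):
    """True if n_files can be open at once under soft_limit.

    `reserved` is an assumed allowance for stdin/stdout/stderr and one
    output file; it is a guess, not a number taken from Jellyfish.
    """
    return n_files + reserved <= soft_limit


def raise_soft_limit(target):
    """Raise the soft limit toward `target`, never above the hard limit.

    Returns the soft limit in effect afterwards.
    """
    soft, hard = open_file_limits()
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to raise
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return open_file_limits()[0]


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print("soft limit:", soft, "hard limit:", hard)
    print("room for 1021 temp files:", fits_within_limit(1021, soft))
```

A limit raised this way applies only to the Python process and to any child processes it starts afterwards, so Jellyfish would have to be launched from the same script (for example with subprocess) to benefit.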
You're also using --counter-len=1, which uses one bit, not one byte, per counter in the in-memory hash table, whereas --out-counter-len=1 uses one byte, not one bit. I suggest you try running with the default --counter-len of 7, which should hopefully reduce the number of times Jellyfish dumps to a temp file. As it is, once a k-mer has been seen more than once, it steals another slot in the hash table to use as additional counter space, which I'm guessing is why the hash table is filling up so many times.
Also, -F 900 means Jellyfish can read from 900 input files at once, but you've only provided it one, so I don't think this is useful.
Disclaimer: I'm not a maintainer, but have used Jellyfish for a while and poked around the internals.
Hope that helps.
Regarding the -F switch: it is used when you have more than one input stream, say when piping multiple BAM files or multiple zipped files. Usually Jellyfish can read faster than bamToFastq can spit out sequence, hence Jellyfish spends its time waiting. You can have Jellyfish read more than one stream concurrently and make the whole process faster.
So here it is not useful.
aconz2's description of counter-len and out-counter-len is correct. In general, if you have an expected coverage of C, choose a counter-len of at least log(2*C), where the log is base 2. This way most k-mers will use only one slot in the hash table.
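As a quick worked example of that rule of thumb, here is a small Python sketch. The function name is made up for illustration, and rounding up to a whole number of bits (with a floor of 1) is an assumption about how to read "at least log(2*C)".

```python
import math


def min_counter_len(coverage):
    """Smallest whole number of bits b such that b >= log2(2 * coverage).

    Rounding up and the minimum of 1 bit are assumptions made here,
    not behaviour taken from Jellyfish itself.
    """
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return max(1, math.ceil(math.log2(2 * coverage)))


if __name__ == "__main__":
    for c in (10, 30, 50, 64):
        print("coverage", c, "-> counter-len >=", min_counter_len(c))
```

For an expected coverage of 30 this gives 6 bits, and for 50 or 64 it gives 7.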
Any reason why you are using the --disk switch? Given the small size of your k-mer (11), everything should happen in memory. So try something like:
jellyfish count -t 10 -m 11 -s 256M --counter-len=7 --out-counter-len=1 --canonical -o sa.fa.bam.filtered.bam.11.jf.tmp <(...)
Hello All,
Thanks for the comments. ulimit solved the problem. Thanks also for the advice on -F and --counter-len. I'll try it.
I need the --disk parameter to be able to join the files later on. In a previous conversation I was advised to use it, otherwise merge would crash.
Regards
Hello,
When running jellyfish with the following command, piping fastq out of a bam file, I get the following error:
time /home/aflit001/dev/phylogenomics2/jellyfish count -t 10 -F 900 -m 11 -s 128M --disk --counter-len=1 --out-counter-len=1 --canonical -o sa.fa.bam.filtered.bam.11.jf.tmp --timing=sa.fa.bam.filtered.bam.11.jf.t <( bamToFastq -i ./denovo/sa.fa.bam.filtered.bam -fq /dev/stdout )
terminate called after throwing an instance of 'MergeError'
  what():  Failed to open input file 'sa.fa.bam.filtered.bam.11.jf.tmp1020'
Aborted (core dumped)
Can you help with this?