crcardenas closed this issue 1 year ago
This is not really a bug in Phyluce; it's an operating system limitation. I'm not sure what operating system you are using, but you can generally adjust the open-file limit temporarily using `ulimit`. You can run `ulimit -a` to show the various OS limits, and you can raise the number of open files allowed with `ulimit -n` followed by the number of file descriptors you want to allow.
For example, on my Mac:

```
$ ulimit -a
-t: cpu time (seconds) unlimited
-f: file size (blocks) unlimited
-d: data seg size (kbytes) unlimited
-s: stack size (kbytes) 8176
-c: core file size (blocks) 0
-v: address space (kbytes) unlimited
-l: locked-in-memory size (kbytes) unlimited
-u: processes 10666
-n: file descriptors 256
```
And if I run `ulimit -n 512` followed by `ulimit -a`, the number of file descriptors allowed changes to 512:

```
$ ulimit -a
-t: cpu time (seconds) unlimited
-f: file size (blocks) unlimited
-d: data seg size (kbytes) unlimited
-s: stack size (kbytes) 8176
-c: core file size (blocks) 0
-v: address space (kbytes) unlimited
-l: locked-in-memory size (kbytes) unlimited
-u: processes 10666
-n: file descriptors 512
```
Thanks for the quick response and your time. I am on a private Linux server (Ubuntu 16.04.4 LTS, GNU/Linux 4.4.0-119-generic x86_64).
```
$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 2063412
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 2063412
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
```
Looking at the upper limit shows this is as high as I can take it:

```
$ ulimit -Sn
1024
$ ulimit -Hn
1024
```
I expect it depends on the data, but is there an expected upper limit with 95 taxa? Or, if I need to do this again, should I subset the data when using `phyluce_probe_run_multiple_lastzs_sqlite`?
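For context on those two numbers: `-Sn` reports the soft limit and `-Hn` the hard limit. A process can raise its own soft limit up to the hard limit, but raising the hard limit requires root, which is why 1024 is a hard ceiling here. A minimal Python sketch using the standard-library `resource` module (the target of 4096 is just an illustrative value):

```python
import resource

# Query the current open-file limits, equivalent to `ulimit -Sn` / `ulimit -Hn`.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft={soft}, hard={hard}")

# An unprivileged process may change its soft limit, but only up to the hard limit.
target = 4096  # illustrative value, not a recommendation
new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
```

When the hard limit is 1024, as above, this clamps the request to 1024; going higher needs a change at the OS level.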
I've set mine to 4096 and rarely have an issue. I usually also process smaller batches of taxa at one time (e.g. 4 batches of 24 in the case of 96 taxa).
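The batching suggestion can be sketched as a simple chunking helper (the taxon names below are placeholders, not real data):

```python
def batch(items, size):
    """Split a sequence into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

taxa = [f"taxon_{i:02d}" for i in range(96)]  # placeholder names
batches = batch(taxa, 24)  # 96 taxa in batches of 24 gives 4 batches
```

Each batch can then be run through the alignment step separately, keeping the number of simultaneously open files per run well under the limit.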
You can also change `ulimit` permanently; the method depends on your OS, so check Google for yours. Also see here for some discussion.
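On Ubuntu, one common way to make the change permanent is through `/etc/security/limits.conf`, which `pam_limits` applies at login. The entries below are illustrative values, not a prescription; editing the file requires root, and the new limits only take effect in a fresh login session:

```
# /etc/security/limits.conf -- illustrative entries, adjust to your needs
*    soft    nofile    4096
*    hard    nofile    4096
```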
Thank you so much for your help!
You're welcome 👍. Good luck with your research!
I am "slicing" UCEs from different data types (anchored hybrid enrichment (AHE), transcriptomes, and genomes), following the "Harvesting UCEs from genomes" tutorial. I subset by data type and perform the lastz_sqlite step. After running the script, I get an interesting error that causes the search to fail. Of the 95 AHE samples, 10 fail to be processed due to this error:
I have further subset the AHE data into smaller sets, and it is currently running fine, but I thought you might want to know that this is an issue.
I'm not sure what other information you might need, but I am happy to provide more detail if you would like.