dereneaton / ipyrad

Interactive assembly and analysis of RAD-seq data sets
http://ipyrad.readthedocs.io
GNU General Public License v3.0

step 3 ipyrad error #507

Closed luchoamador closed 1 year ago

luchoamador commented 1 year ago

Hello,

I am running ipyrad on the university cluster. My data are 48 single-end samples obtained from ddRADseq. I am getting an error during step 3, and I cannot figure out the source of the error. Any help would be much appreciated.

This is the code and the error message:

**(ipyrad) [lamador@wheeler analysis_files]$ ipyrad -p params-params_Hyla.txt -s 3 -c 12
loading Assembly: params_Hyla from saved path: ~/Documents/US_amphibians_PopGenomics/Hyla_species/ipyrad_Hyla/params_Hyla.json


ipyrad [v.0.9.50] Interactive assembly and analysis of RAD-seq data


Parallel connection | wheeler: 12 cores

Step 3: Clustering/Mapping reads within samples
[####################] 100% 0:02:46 | dereplicating
[####################] 100% 3:28:21 | clustering/mapping

Encountered an Error. Message: IPyradError: cmd ['/users/lamador/.conda/envs/ipyrad/bin/vsearch', '-cluster_smallmem', '/users/lamador/Documents/US_amphibians_PopGenomics/Hyla_species/ipyrad_Hyla/params_Hyla-tmpalign/Hyla_chrysoscelis_LNB00591.1_derep.fa', '-strand', 'plus', '-query_cov', '0.5', '-id', '0.85', '-minsl', '0.5', '-userout', '/users/lamador/Documents/US_amphibians_PopGenomics/Hyla_species/ipyrad_Hyla/params_Hyla_clust_0.85/Hyla_chrysoscelis_LNB00591.1.utemp', '-userfields', 'query+target+id+gaps+qstrand+qcov', '-maxaccepts', '1', '-maxrejects', '0', '-threads', '2', '-notmatched', '/users/lamador/Documents/US_amphibians_PopGenomics/Hyla_species/ipyrad_Hyla/params_Hyla_clust_0.85/Hyla_chrysoscelis_LNB00591.1.htemp', '-fasta_width', '0', '--minseqlength', '50', '-fulldp', '-usersort']: b'WARNING: Option --fulldp is ignored\nvsearch v2.18.0_linux_x86_64, 47.0GB RAM, 8 cores\nhttps://github.com/torognes/vsearch\n\nReading file /users/lamador/Documents/US_amphibians_PopGenomics/Hyla_species/ipyrad_Hyla/params_Hyla-tmpalign/Hyla_chrysoscelis_LNB00591.1_derep.fa 100%\n175261788 nt in 1329171 seqs, min 50, max 133, avg 132\nMasking 100%\nCounting k-mers 100%\nClustering'

Parallel connection closed.**

Thank you very much in advance, Luis

isaacovercast commented 1 year ago

Hi Luis,

There are a couple things to try:

1) Run ipyrad step 3 again and include the -d flag to print full debug info

2) Upgrade to the most recent version of ipyrad (0.9.90) and try again

Please let me know the results if you get a chance to try either of these.

luchoamador commented 1 year ago

Hi Isaac, thank you for your reply!

I was able to upgrade ipyrad, and I ran step 3 again with the -d flag. I think I got the same error, but now with the full debug info printed. I am sorry for the very long text:

(ipyrad) [lamador@hopper analysis_files]$ ipyrad -p params-test.txt -s 3 -c 8 -d
loading Assembly: ipyrad_test from saved path: ~/Documents/US_amphibians_PopGenomics/Hyla_species/ipyrad_output/ipyrad_test.json


ipyrad [v.0.9.90] Interactive assembly and analysis of RAD-seq data


Parallel connection | hopper: 8 cores

Step 3: Clustering/Mapping reads within samples
[####################] 100% 0:02:07 | dereplicating
[####################] 100% 3:36:50 | clustering/mapping

Encountered an Error. Message: IPyradError: cmd ['/users/lamador/.conda/envs/ipyrad/bin/vsearch', '-cluster_smallmem', '/users/lamador/Documents/US_amphibians_PopGenomics/Hyla_species/ipyrad_output/ipyrad_test-tmpalign/Hyla_chrysoscelis_LNB00591.1_derep.fa', '-strand', 'plus', '-query_cov', '0.5', '-id', '0.85', '-minsl', '0.5', '-userout', '/users/lamador/Documents/US_amphibians_PopGenomics/Hyla_species/ipyrad_output/ipyrad_test_clust_0.85/Hyla_chrysoscelis_LNB00591.1.utemp', '-userfields', 'query+target+id+gaps+qstrand+qcov', '-maxaccepts', '1', '-maxrejects', '0', '-threads', '2', '-notmatched', '/users/lamador/Documents/US_amphibians_PopGenomics/Hyla_species/ipyrad_output/ipyrad_test_clust_0.85/Hyla_chrysoscelis_LNB00591.1.htemp', '-fasta_width', '0', '--minseqlength', '50', '-fulldp', '-usersort']: b'WARNING: Option --fulldp is ignored\nvsearch v2.18.0_linux_x86_64, 92.7GB RAM, 64 cores\nhttps://github.com/torognes/vsearch\n\nReading file /users/lamador/Documents/US_amphibians_PopGenomics/Hyla_species/ipyrad_output/ipyrad_test-tmpalign/Hyla_chrysoscelis_LNB00591.1_derep.fa 100%\n181960309 nt in 1330185 seqs, min 50, max 138, avg 137\nMasking 100%\nCounting k-mers 100%\nClustering' Error: ipcluster shutdown and must be restarted Parallel connection closed. 
---------------------------------------------------------------------------
IPyradError Traceback (most recent call last)
~/.conda/envs/ipyrad/lib/python3.7/site-packages/ipyrad/assemble/clustmap.py in cluster(data, sample, nthreads, force)
   1106     # check for errors
   1107     if proc.returncode:
-> 1108         raise IPyradError("cmd {}: {}".format(cmd, res))
   1109
   1110
IPyradError: cmd ['/users/lamador/.conda/envs/ipyrad/bin/vsearch', '-cluster_smallmem', '/users/lamador/Documents/US_amphibians_PopGenomics/Hyla_species/ipyrad_output/ipyrad_test-tmpalign/Hyla_chrysoscelis_LNB00591.1_derep.fa', '-strand', 'plus', '-query_cov', '0.5', '-id', '0.85', '-minsl', '0.5', '-userout', '/users/lamador/Documents/US_amphibians_PopGenomics/Hyla_species/ipyrad_output/ipyrad_test_clust_0.85/Hyla_chrysoscelis_LNB00591.1.utemp', '-userfields', 'query+target+id+gaps+qstrand+qcov', '-maxaccepts', '1', '-maxrejects', '0', '-threads', '2', '-notmatched', '/users/lamador/Documents/US_amphibians_PopGenomics/Hyla_species/ipyrad_output/ipyrad_test_clust_0.85/Hyla_chrysoscelis_LNB00591.1.htemp', '-fasta_width', '0', '--minseqlength', '50', '-fulldp', '-usersort']: b'WARNING: Option --fulldp is ignored\nvsearch v2.18.0_linux_x86_64, 92.7GB RAM, 64 cores\nhttps://github.com/torognes/vsearch\n\nReading file /users/lamador/Documents/US_amphibians_PopGenomics/Hyla_species/ipyrad_output/ipyrad_test-tmpalign/Hyla_chrysoscelis_LNB00591.1_derep.fa 100%\n181960309 nt in 1330185 seqs, min 50, max 138, avg 137\nMasking 100%\nCounting k-mers 100%\nClustering'

Thanks again, Luis

isaacovercast commented 1 year ago

Hi Luis, Hm, that is weird. It looks like the clustering command is just dying and not telling us why. Can you try running this command by hand inside the ipyrad environment? Just copy/paste the whole thing into your terminal and it should work:

/users/lamador/.conda/envs/ipyrad/bin/vsearch -cluster_smallmem /users/lamador/Documents/US_amphibians_PopGenomics/Hyla_species/ipyrad_output/ipyrad_test-tmpalign/Hyla_chrysoscelis_LNB00591.1_derep.fa -strand plus -query_cov 0.5 -id 0.85 -minsl 0.5 -userout /users/lamador/Documents/US_amphibians_PopGenomics/Hyla_species/ipyrad_output/ipyrad_test_clust_0.85/Hyla_chrysoscelis_LNB00591.1.utemp -userfields query+target+id+gaps+qstrand+qcov -maxaccepts 1 -maxrejects 0 -threads 2 -notmatched /users/lamador/Documents/US_amphibians_PopGenomics/Hyla_species/ipyrad_output/ipyrad_test_clust_0.85/Hyla_chrysoscelis_LNB00591.1.htemp -fasta_width 0 --minseqlength 50 -fulldp -usersort

Please show me the full output of this command.
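When a wrapped command dies without printing a reason, its exit status can still carry a hint. The following is a minimal Python sketch, assuming a Linux/POSIX system: `subprocess` reports a negative `returncode` when the child process was terminated by a signal, and the signal name (for example `SIGKILL` or `SIGXCPU`) points toward the cause. The helper name `describe_exit` is hypothetical and is not part of ipyrad; the self-killing shell is only a stand-in for the real vsearch call.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Translate a subprocess return code into a short description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, a negative value means the child was killed by that signal.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status {}".format(returncode)


if __name__ == "__main__":
    # Stand-in for the real command: a shell that sends SIGKILL to itself.
    proc = subprocess.run(["sh", "-c", "kill -KILL $$"])
    print(describe_exit(proc.returncode))
```

A check like this only says which signal ended the process, not who sent it, so the cluster's own logs or admins may still be needed to find out why.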

luchoamador commented 1 year ago

Hi Isaac,

Thank you so much for the time and the help provided.

I ran the command, but it stopped after a few minutes. This was the start of the output:

WARNING: Option --fulldp is ignored
vsearch v2.18.0_linux_x86_64, 92.7GB RAM, 64 cores
https://github.com/torognes/vsearch
...

And the end of the output was:

Masking 100%
Counting k-mers 100%
Clustering 78%CPU time limit exceeded (core dumped)

I am not sure whether the problem is with the cluster rather than with the program.

Thank you again, Luis

isaacovercast commented 1 year ago

Hi Luis,

Yes, it is definitely the cluster and not ipyrad. The "CPU time limit exceeded" message is coming from the cluster. Can you try allocating more time in your job submission script? The problem is that the job is running out of its time allocation on the cluster.
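On Linux, "CPU time limit exceeded" is the text normally shown when a process receives SIGXCPU after passing its per-process CPU-time limit. A small Python sketch for inspecting that limit in the current session is shown here; it uses the standard `resource` module (Unix only), and the function name `cpu_time_limit` is hypothetical. Whether this particular limit is the one that stopped the run above is an assumption; a scheduler can also enforce time limits in other ways.

```python
import resource


def cpu_time_limit():
    """Return the (soft, hard) CPU-time limits in seconds; None means unlimited."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CPU)

    def clean(value):
        # RLIM_INFINITY is the sentinel for "no limit".
        return None if value == resource.RLIM_INFINITY else value

    return clean(soft), clean(hard)


if __name__ == "__main__":
    soft, hard = cpu_time_limit()
    print("soft CPU-time limit (s):", soft)
    print("hard CPU-time limit (s):", hard)
```

Running this on a login node and again inside a submitted job would show whether the two environments allow different amounts of CPU time.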

Hope you are doing well, -isaac

luchoamador commented 1 year ago

Hi Isaac,

Just to follow up on this issue: I was able to run ipyrad on the cluster using a Slurm script, thanks to the suggestion of a colleague at UNM.

Thank you again for your help! Luis

isaacovercast commented 1 year ago

Hi Luis, thanks for the update, glad you got it working! -isaac

josefbenito commented 1 month ago

Hi Isaac,

I am facing the same issue that Luis described in this thread. I am trying to get ipyrad v0.9.95 running on an HPC cluster; the same analysis runs perfectly fine on my MacBook Pro and our Dell workstation. I ran the full clustering command inside the ipyrad environment, and this is what I got:

(base) uahjbb001@asaxlogin3:~> /home/uahjbb001/.conda/envs/ipyrad/bin/vsearch -cluster_smallmem /home/uahjbb001/Cave_salamander_RADSeq/Part2_testrun-tmpalign/AW-11-88_derep.fa -strand plus -query_cov 0.5 -id 0.85 -minsl 0.5 -userout /home/uahjbb001/Cave_salamander_RADSeq/Part2_testrun_clust_0.85/AW-11-88.utemp -userfields query+target+id+gaps+qstrand+qcov -maxaccepts 1 -maxrejects 0 -threads 2 -notmatched /home/uahjbb001/Cave_salamander_RADSeq/Part2_testrun_clust_0.85/AW-11-88.htemp -fasta_width 0 --minseqlength 35 -fulldp -usersort
WARNING: Option --fulldp is ignored
vsearch v2.28.1_linux_x86_64, 57.6GB RAM, 8 cores
https://github.com/torognes/vsearch

Reading file /home/uahjbb001/Cave_salamander_RADSeq/Part2_testrun-tmpalign/AW-11-88_derep.fa 100%
1201071911 nt in 4173698 seqs, min 35, max 290, avg 288
Masking 100%
Counting k-mers 100%
Clustering 10%Killed

In my case, the task is not running out of its time allocation on the cluster; it gets killed after 10% of the clustering. Could you please look into it and suggest any troubleshooting tips? Thanks!

Joseph

isaacovercast commented 1 month ago

Hi Joseph, If ipyrad is getting killed on your HPC system, it could be for a couple of different reasons. You already said it's not because of time, but it could also be because of other resource issues, like running out of disk space or using too much RAM. There are many ways that HPC admins can constrain resource usage, so it might be best to ask your HPC admins. What I can tell you is that for step 3 the RAM usage can be an issue, particularly if your raw file (AW-11-88_derep.fa) is very large or if the genome of the organism is very large (which it probably is if it's a salamander). 56GB seems like it should be enough, but it might not be (especially if the reads are long and paired-end). Let me know what you find based on this information and we'll try to get it moving again.
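To judge how large an input is, the summary line that vsearch prints after reading the file can be parsed directly. This is a minimal Python sketch that assumes the line keeps the format seen in the logs in this thread ("N nt in M seqs, min A, max B, avg C"); the function name `parse_vsearch_summary` is hypothetical. For comparison, the file in Joseph's log holds about 1.2 billion nt, while the one in Luis's log holds about 0.18 billion.

```python
import re

# Pattern for the summary line as it appears in the logs quoted in this thread.
SUMMARY = re.compile(
    r"(?P<nt>\d+) nt in (?P<seqs>\d+) seqs, "
    r"min (?P<min>\d+), max (?P<max>\d+), avg (?P<avg>\d+)"
)


def parse_vsearch_summary(text):
    """Return the summary numbers as a dict of ints, or None if not found."""
    match = SUMMARY.search(text)
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}


if __name__ == "__main__":
    log = "1201071911 nt in 4173698 seqs, min 35, max 290, avg 288"
    stats = parse_vsearch_summary(log)
    print(stats)
    print("billions of nt:", stats["nt"] / 1e9)
```

The numbers say nothing on their own about how much RAM vsearch will need, but they make it easy to compare a failing sample against one that ran successfully.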

luchoamador commented 1 month ago

Hi Joseph and Isaac,

Coincidentally, I had the same error last night while running step 3 for paired-end reads from 90 samples of a frog species. I was running it directly on one of the university's supercomputers because our designated partition is busy. The way I solved it previously (as you can see in the earlier messages) was to use a Slurm script, where you can specify the --cpus-per-task and the --time you need for the step.

I hope this can help.

Best, Luis

josefbenito commented 1 month ago

Hi Isaac and Luis,

Thanks for your quick reply.

Isaac: Disk space is not an issue. Each user is assigned 10 TB on our HPC cluster. The size of the genome/file should also not be an issue, as I chose only fastq files of ideal size for the test run. I posted here only after exchanging a few emails about this issue with our HPC admin. He didn't mention anything about RAM usage constraints. I am very sure the issue has something to do with parallelization, that is, the RAM/cores used for the run.

Luis: Could you please share the Slurm script with the --cpus-per-task and --time flags that you used for your run?

I tried a Slurm script just like the one in the 'Running ipyrad on a cluster' section of the ipyrad tutorial, with --cpus-per-task 32 (-c 32), and ended up with the same error again:


ipyrad [v.0.9.95] Interactive assembly and analysis of RAD-seq data


Parallel connection | asaxlogin2.asc.edu: 32 cores

Step 3: Clustering/Mapping reads within samples
[####################] 100% 0:01:19 | join merged pairs
[####################] 100% 0:01:09 | join unmerged pairs
[####################] 100% 0:00:49 | dereplicating
[####################] 100% 0:10:20 | clustering/mapping

Encountered an Error. Message: IPyradError: cmd ['/home/uahjbb001/.conda/envs/ipyrad/bin/vsearch', '-cluster_smallmem', '/home/uahjbb001/Cave_salamander_RADSeq/Part2_testrun-tmpalign/AW-11-88_derep.fa', '-strand', 'plus', '-query_cov', '0.5', '-id', '0.85', '-minsl', '0.5', '-userout', '/home/uahjbb001/Cave_salamander_RADSeq/Part2_testrun_clust_0.85/AW-11-88.utemp', '-userfields', 'query+target+id+gaps+qstrand+qcov', '-maxaccepts', '1', '-maxrejects', '0', '-threads', '2', '-notmatched', '/home/uahjbb001/Cave_salamander_RADSeq/Part2_testrun_clust_0.85/AW-11-88.htemp', '-fasta_width', '0', '--minseqlength', '35', '-fulldp', '-usersort']: b'WARNING: Option --fulldp is ignored\nvsearch v2.28.1_linux_x86_64, 62.4GB RAM, 8 cores\nhttps://github.com/torognes/vsearch\n\nReading file /home/uahjbb001/Cave_salamander_RADSeq/Part2_testrun-tmpalign/AW-11-88_derep.fa 100%\n1201071911 nt in 4173698 seqs, min 35, max 290, avg 288\nMasking 100%\nCounting k-mers 100%\nClustering' Error: ipcluster shutdown and must be restarted Parallel connection closed.

IPyradError Traceback (most recent call last) File :1

File ~/.local/lib/python3.10/site-packages/ipyrad/assemble/clustmap.py:1108, in cluster(data, sample, nthreads, force) 1106 # check for errors 1107 if proc.returncode: -> 1108 raise IPyradError("cmd {}: {}".format(cmd, res))

IPyradError: cmd ['/home/uahjbb001/.conda/envs/ipyrad/bin/vsearch', '-cluster_smallmem', '/home/uahjbb001/Cave_salamander_RADSeq/Part2_testrun-tmpalign/AW-11-88_derep.fa', '-strand', 'plus', '-query_cov', '0.5', '-id', '0.85', '-minsl', '0.5', '-userout', '/home/uahjbb001/Cave_salamander_RADSeq/Part2_testrun_clust_0.85/AW-11-88.utemp', '-userfields', 'query+target+id+gaps+qstrand+qcov', '-maxaccepts', '1', '-maxrejects', '0', '-threads', '2', '-notmatched', '/home/uahjbb001/Cave_salamander_RADSeq/Part2_testrun_clust_0.85/AW-11-88.htemp', '-fasta_width', '0', '--minseqlength', '35', '-fulldp', '-usersort']: b'WARNING: Option --fulldp is ignored\nvsearch v2.28.1_linux_x86_64, 62.4GB RAM, 8 cores\nhttps://github.com/torognes/vsearch\n\nReading file /home/uahjbb001/Cave_salamander_RADSeq/Part2_testrun-tmpalign/AW-11-88_derep.fa 100%\n1201071911 nt in 4173698 seqs, min 35, max 290, avg 288\nMasking 100%\nCounting k-mers 100%\nClustering'

Any thoughts or inputs would be really helpful!

Joseph

isaacovercast commented 1 month ago

I see in the debug output that vsearch thinks it is getting 8 cores, which is weird because the -threads argument is set to 2. If you are asking for 32 cores from the HPC and vsearch is 'grabbing' more cores than it should, the HPC system might not like this and might kill the task. This would explain why it dies in the "Clustering" stage, which is where the multiprocessing happens. You could try using 32 cores in the HPC script, but then giving ipyrad 'fewer' cores for the ipcluster instance -c 16 or something like that, or you could try manipulating the number of threads assigned to vsearch with the ipyrad -t 1 argument.

josefbenito commented 1 month ago

Hi Isaac,

Thanks for the quick reply. I am not submitting the job to the PBS queue of our HPC system. If I do that, the job runs indefinitely and gets killed once it runs out of wall time with no error message. Lately, I am doing all my test runs interactively on the command line. At least, I am able to see the progress of the run and the error message at the end. I tried what you suggested, ipyrad -p params-test.txt -s 3 -f -d -c 16 -t 1 (16 cores and 1 thread). Still the same error,


ipyrad [v.0.9.95] Interactive assembly and analysis of RAD-seq data


Parallel connection | asaxlogin2.asc.edu: 16 cores

Step 3: Clustering/Mapping reads within samples
[####################] 100% 0:01:18 | join merged pairs
[####################] 100% 0:01:02 | join unmerged pairs
[####################] 100% 0:00:37 | dereplicating
[####################] 100% 0:10:25 | clustering/mapping

Encountered an Error. Message: IPyradError: cmd ['/usr/local/anaconda/bin/vsearch', '-cluster_smallmem', '/home/uahjbb001/Cave_salamander_RADSeq/Part2_testrun-tmpalign/AW-11-88_derep.fa', '-strand', 'plus', '-query_cov', '0.5', '-id', '0.85', '-minsl', '0.5', '-userout', '/home/uahjbb001/Cave_salamander_RADSeq/Part2_testrun_clust_0.85/AW-11-88.utemp', '-userfields', 'query+target+id+gaps+qstrand+qcov', '-maxaccepts', '1', '-maxrejects', '0', '-threads', '1', '-notmatched', '/home/uahjbb001/Cave_salamander_RADSeq/Part2_testrun_clust_0.85/AW-11-88.htemp', '-fasta_width', '0', '--minseqlength', '35', '-fulldp', '-usersort']: b'WARNING: Option --fulldp is ignored\nvsearch v2.28.1_linux_x86_64, 62.4GB RAM, 8 cores\nhttps://github.com/torognes/vsearch\n\nReading file /home/uahjbb001/Cave_salamander_RADSeq/Part2_testrun-tmpalign/AW-11-88_derep.fa 100%\n1201071911 nt in 4173698 seqs, min 35, max 290, avg 288\nMasking 100%\nCounting k-mers 100%\nClustering' Error: ipcluster shutdown and must be restarted Parallel connection closed.

IPyradError Traceback (most recent call last) File :1

File ~/.local/lib/python3.10/site-packages/ipyrad/assemble/clustmap.py:1108, in cluster(data, sample, nthreads, force) 1106 # check for errors 1107 if proc.returncode: -> 1108 raise IPyradError("cmd {}: {}".format(cmd, res))

IPyradError: cmd ['/usr/local/anaconda/bin/vsearch', '-cluster_smallmem', '/home/uahjbb001/Cave_salamander_RADSeq/Part2_testrun-tmpalign/AW-11-88_derep.fa', '-strand', 'plus', '-query_cov', '0.5', '-id', '0.85', '-minsl', '0.5', '-userout', '/home/uahjbb001/Cave_salamander_RADSeq/Part2_testrun_clust_0.85/AW-11-88.utemp', '-userfields', 'query+target+id+gaps+qstrand+qcov', '-maxaccepts', '1', '-maxrejects', '0', '-threads', '1', '-notmatched', '/home/uahjbb001/Cave_salamander_RADSeq/Part2_testrun_clust_0.85/AW-11-88.htemp', '-fasta_width', '0', '--minseqlength', '35', '-fulldp', '-usersort']: b'WARNING: Option --fulldp is ignored\nvsearch v2.28.1_linux_x86_64, 62.4GB RAM, 8 cores\nhttps://github.com/torognes/vsearch\n\nReading file /home/uahjbb001/Cave_salamander_RADSeq/Part2_testrun-tmpalign/AW-11-88_derep.fa 100%\n1201071911 nt in 4173698 seqs, min 35, max 290, avg 288\nMasking 100%\nCounting k-mers 100%\nClustering'

Just like you mentioned, whatever cores I specify at the beginning using the -c flag, I see the same in the first line 'Parallel connection | asaxlogin2.asc.edu: 16 cores' but the error message always says '8 cores'. No idea why.

Joseph

isaacovercast commented 1 month ago

It occurs to me that maybe vsearch is just reporting the system level resources here: "62.4GB RAM, 8 cores", so maybe it's not something to worry about. You can also watch how many cores it's actually using with top while it is running.

Also, if you are running on interactive nodes these sometimes have resource limitations (for example 8 cores only), so if you run a 16 core job on an 8 core compute node this can sometimes trigger a sigkill if the system is set that way. It really feels like you are just overstepping a bound on the cluster and it's hard killing the process.
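One way to see the difference between what the machine has and what the current session may use is to compare the total CPU count with the CPU affinity of the process. This is a minimal Python sketch, assuming Linux (`os.sched_getaffinity` is not available on every platform, so there is a fallback); the function name `visible_cpus` is hypothetical. Some schedulers restrict CPU use by other means that this check does not see.

```python
import os


def visible_cpus():
    """Return (total CPUs on the machine, CPUs this process may run on)."""
    total = os.cpu_count() or 1
    try:
        # Linux only: the set of CPUs the current process is allowed to use.
        usable = len(os.sched_getaffinity(0))
    except AttributeError:
        # Platforms without sched_getaffinity: fall back to the total.
        usable = total
    return total, usable


if __name__ == "__main__":
    total, usable = visible_cpus()
    print("CPUs on this machine:", total)
    print("CPUs usable by this process:", usable)
```

If the usable count is smaller than the value passed to ipyrad with -c, the job is asking for more cores than the session has been given.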

Also, if you are having problems with ipyrad running on the pbs queue, did you take a look at the ipyrad hpc faq https://ipyrad.readthedocs.io/en/latest/HPC_script.html#optional-controlling-ipcluster-by-hand? This might help get that part working.

-isaac


luchoamador commented 1 month ago

Hi Joseph, Here is the slurm script that I used to run step 3:

######################################################################################
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --time=48:00:00
#SBATCH --job-name=pipiens_step3_min74
#SBATCH --output=_ipyrad3outRp%j
#SBATCH --error=_ipyrad3errorRp%j
#SBATCH --partition=mypartition
#SBATCH --account=mynumberaccount
#SBATCH --mail-type=FAIL,END
#SBATCH --mail-user=myemail

module load miniconda3
source activate ipyrad

ipyrad -p params_file_Rpipiens.txt -s 3
###########################################################################

I hope this could help.

Luis
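A script like this can also be generated so that the cores requested from Slurm and the cores passed to ipyrad always match. This is a minimal Python sketch under stated assumptions: the helper `render_sbatch` is hypothetical, it only uses sbatch options and ipyrad flags that already appear in this thread, and it leaves out site-specific lines (partition, account, module loading, environment activation) that would need to be added by hand.

```python
def render_sbatch(params_file, step, cpus, hours, threads=2, job_name="ipyrad_step"):
    """Build the text of a Slurm batch script for one ipyrad step."""
    lines = [
        "#!/bin/bash",
        "#SBATCH --ntasks=1",
        "#SBATCH --cpus-per-task={}".format(cpus),
        "#SBATCH --time={:02d}:00:00".format(hours),
        "#SBATCH --job-name={}".format(job_name),
        "",
        # Pass the same core count to ipyrad that was requested from Slurm.
        "ipyrad -p {} -s {} -c {} -t {}".format(params_file, step, cpus, threads),
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_sbatch("params_file_Rpipiens.txt", 3, cpus=8, hours=48))
```

Keeping the two numbers tied to one variable avoids the case where ipyrad tries to use more cores than the job was granted.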


josefbenito commented 1 month ago

Hi Isaac and Luis,

Thanks for your reply.

Isaac: This tip of yours worked! "You could try using 32 cores in the HPC script, but then giving ipyrad 'fewer' cores for the ipcluster instance -c 16 or something like that, or you could try manipulating the number of threads assigned to vsearch with the ipyrad -t 1 argument".

I was able to submit an ipyrad job (ipyrad -p params-test.txt -s 3 -d -c 40 -t 4) to the PBS queue of our HPC cluster, requesting 50 cores, 120 GB RAM, and 360 hrs of wall time. The job is currently running, and I can see that the 'Test_run-tmpalign' and 'Test_run_clust_0.85' folders have been created and populated with merged and nonmerged fastq files.

Please confirm the max cores and threads we can specify for ipyrad with the -c and -t flags. I guess the default and max is 40 cores but what about threads? I don't exactly get how threads work but I believe a run with -c 40 -t 4 will be faster than -c 40 -t 2 (the default). Please confirm.

Thanks for your quick back-to-back replies and for pointing out the issue. I appreciate it!

Joseph

isaacovercast commented 1 month ago

Hi Joseph, Glad it worked! There is no max for cores (-c); it is limited by your hardware. The max number of threads is also unlimited, but I wouldn't expect the performance to improve much by increasing it beyond about 4 or so, though I've never really tested it with higher values.
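The -c and -t combinations tried in this thread can be compared with a rough back-of-the-envelope calculation. This Python sketch rests on two assumptions that are not confirmed anywhere above: that about cores // threads clustering jobs run at the same time, and that they share the available RAM evenly. The function name `plan_step3` is hypothetical, and the result is only a planning aid, not a description of how ipyrad schedules work.

```python
def plan_step3(cores, threads, total_ram_gb):
    """Estimate concurrent jobs and RAM per job (assumptions: see lead-in)."""
    if cores < 1 or threads < 1:
        raise ValueError("cores and threads must be at least 1")
    # Assumed, not confirmed: one job per group of `threads` cores.
    jobs = max(1, cores // threads)
    return {"jobs": jobs, "ram_per_job_gb": total_ram_gb / jobs}


if __name__ == "__main__":
    # Settings reported in this thread, with the RAM figures from the logs
    # (62.4 GB reported by vsearch, 120 GB requested from the PBS queue).
    for cores, threads, ram in [(32, 2, 62.4), (16, 1, 62.4), (40, 4, 120.0)]:
        print(cores, threads, plan_step3(cores, threads, ram))
```

Under these assumptions, raising -t while holding -c fixed means fewer jobs running at once, each with a larger share of memory.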