Closed ospfsg closed 3 months ago
Hello, thank you for sending such a detailed account of the issue, it's helpful. I will bet that the problem is '-t 100'. The -t argument specifies the number of threads per core for clustering; the default is 2. With -t 100 I bet it's just crunching on spawning too many threads. Try leaving -t at the default.
Hi, I removed the -t but it is back at the same place: 17 hours at 0% in step 3. After "join unmerged pairs", the next stage, "clustering and mapping", does not progress.
I will keep this running for a while, but I am going to try a subset on another server...
thank you osp
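A small test subset can be built in several ways. The sketch below is one minimal approach, not part of ipyrad: it copies the first N records from an R1/R2 pair of FASTQ files. The function names (`head_fastq`, `head_pair`) are made up for illustration, and it assumes standard 4-line FASTQ records with R1 and R2 listing reads in the same order, so that taking the same count from each keeps the pairs in sync. Note that the first N reads are not a random sample.

```python
import gzip
from itertools import islice


def _open(path, mode):
    """Open a plain or gzip-compressed text file based on its extension."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode)
    return open(path, mode)


def head_fastq(src, dst, n_reads):
    """Copy the first n_reads records (4 lines each) from src to dst.

    Returns the number of complete records written.
    """
    lines_written = 0
    with _open(src, "rt") as fin, _open(dst, "wt") as fout:
        for line in islice(fin, 4 * n_reads):
            fout.write(line)
            lines_written += 1
    return lines_written // 4


def head_pair(r1_src, r2_src, r1_dst, r2_dst, n_reads):
    """Subset both files of a pair with the same record count.

    Assumes R1 and R2 are in the same read order (hypothetical helper).
    """
    return (
        head_fastq(r1_src, r1_dst, n_reads),
        head_fastq(r2_src, r2_dst, n_reads),
    )
```

Running step 3 on such a subset gives a rough idea of whether the run is stalled or just slow, before committing the full data set again.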
How many raw reads per sample after step 1? How long are the reads? 150bp for R1/R2 or longer? There are many factors that can influence the time for step 3, including the number of raw reads per sample, length of reads (particularly for paired-end data), genome size, size selection window, and frequency of enzyme recognition sites. If you can give me more details about all these things I can help a bit, but at the end of the day sometimes things just take more time.
Did you ever get step 3 to complete? Or to run a bit faster? If you would like more ideas on performance as a function of the format of your data, I'm happy to help if you would like to re-open this issue.
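The first two questions (reads per sample and read length) can be answered directly from the FASTQ files. The following is a minimal sketch, not an ipyrad feature: `fastq_stats` and `summarize` are hypothetical helper names, the `*.fastq.gz` pattern is an assumption about how the files are named, and the code assumes standard 4-line FASTQ records.

```python
import gzip
from pathlib import Path


def fastq_stats(path):
    """Return (number_of_reads, mean_read_length) for one FASTQ file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    n_reads = 0
    total_bases = 0
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # In a 4-line FASTQ record the sequence is the second line.
            if line_number % 4 == 1:
                n_reads += 1
                total_bases += len(line.strip())
    mean_length = total_bases / n_reads if n_reads else 0.0
    return n_reads, mean_length


def summarize(directory, pattern="*.fastq.gz"):
    """Map each matching file name in a directory to its fastq_stats result."""
    return {p.name: fastq_stats(p) for p in sorted(Path(directory).glob(pattern))}
```

For example, `summarize("path/to/fastqs")` (a placeholder path) returns one `(reads, mean length)` tuple per file, which is the kind of summary asked for above.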
Hi
I am running ipyrad v0.9.95 with 180 paired-end GBS samples, denovo. In step 3, after "join unmerged pairs", the next stage, "clustering and mapping", is still at 0% after 19 hours. I am using 50 cores, and almost 300 GB of RAM is in use (it can go up to 512 GB).
Samples were preprocessed with fastp to remove reads shorter than 80 bp, overrepresented sequences, poly-G, quality below 20, and adapters.
I already have the adapter filter set to the stricter setting (2) and the phred Qscore offset at 33.
I started with this command: ipyrad -p params-lim_20240627.txt -s 1234567 -c 50 --MPI -t 100
Any suggestions on how to make this faster? osp