Open yejunbin opened 1 month ago
Hi,
Thank you for providing the detailed log. The merge_snps process for 200 samples should not take 3 days. It seems like the issue might be related to memory limitations or CPU thrashing.
Could you confirm the total memory available on your machine? This task is memory-intensive, and if progress has stalled for 3 days, it’s possible the machine was overwhelmed. The call_and_write_population_snps step loads chunk pileups from all samples into memory to calculate population SNPs. The more cores you use, the more memory your system needs. For 200 samples, I recommend using a machine with at least 120 GB of memory and 16 cores (using --num_cores 16), while keeping the default chunk size. If your machine has more memory, you can try increasing to 32 cores.
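A quick way to answer the memory question is a small shell check like the one below. It is only a sketch, assuming a Linux machine with /proc/meminfo and nproc available; the function names total_mem_gb and check_resources are made up for illustration, and the 120 GB / 16 core thresholds are simply the figures suggested above for 200 samples.

```shell
#!/usr/bin/env bash
# Report total RAM (in GB) and logical CPU count, then compare them with
# the figures suggested above for 200 samples (120 GB, 16 cores).

# Total memory in whole GB, read from a meminfo-style file (Linux only).
# MemTotal is reported in kB, so divide by 1024 twice.
total_mem_gb() {
    local meminfo="${1:-/proc/meminfo}"
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' "$meminfo"
}

# Print "ok" if both thresholds are met, otherwise "low".
# Arguments: memory in GB, core count, [required GB], [required cores].
check_resources() {
    local mem_gb="$1" cores="$2" need_gb="${3:-120}" need_cores="${4:-16}"
    if [ "$mem_gb" -ge "$need_gb" ] && [ "$cores" -ge "$need_cores" ]; then
        echo "ok"
    else
        echo "low"
    fi
}

# Only query the live system when /proc/meminfo is actually present.
if [ -r /proc/meminfo ]; then
    mem_gb=$(total_mem_gb)
    cores=$(nproc)
    echo "memory: ${mem_gb} GB, logical cores: ${cores}"
    echo "status: $(check_resources "$mem_gb" "$cores")"
fi
```

Note that MemTotal is usually a little below the installed RAM, so a nominal 128 GB machine will report slightly less than 128 here.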
A few additional notes:
Are you using vCPUs or physical CPUs?
The --chunk_size 200000 isn’t the default chunk size. I recommend running:
midas compute_chunks --chunk_type merge_snps --chunk_size 200000 --species all --midasdb_name $db_name --midasdb_dir $db_dir --debug --force -t ${num_cores}
This will calculate the chunk information accordingly.
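For the vCPU versus physical CPU question, the output of lscpu can be summarized roughly as follows. This is a sketch under assumptions: it expects a Linux machine with lscpu (util-linux) and English field names, and cpu_summary is a hypothetical helper name. Inside a virtual machine, lscpu shows the topology the guest sees, so a non-empty hypervisor field is the hint that the cores are vCPUs.

```shell
#!/usr/bin/env bash
# Summarize CPU topology from lscpu-style text read on stdin.
# Prints the core count (cores per socket x sockets), the threads per core,
# and the hypervisor vendor ("none" if lscpu reports no hypervisor).
cpu_summary() {
    awk -F': *' '
        /^ *Thread\(s\) per core:/ { threads = $2 }
        /^ *Core\(s\) per socket:/ { cores = $2 }
        /^ *Socket\(s\):/          { sockets = $2 }
        /^ *Hypervisor vendor:/    { hyp = $2 }
        END {
            printf "physical_cores=%d threads_per_core=%d hypervisor=%s\n",
                   cores * sockets, threads, (hyp == "" ? "none" : hyp)
        }'
}

# Run against the live system only if lscpu is installed.
if command -v lscpu >/dev/null 2>&1; then
    lscpu | cpu_summary
fi
```

If threads_per_core is 2, the logical CPU count is twice the core count, which is worth keeping in mind when choosing the value passed to --num_cores.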
Let me know if this works for you!
Best,
Chunyu
Description:
I encountered an issue when running the midas2 merge_snps command. The process has been running for several days without noticeable progress. The log file seems to show repeated "start" and "finish" messages for various species, but many of the output folders for certain species are either empty or incomplete.
Steps to Reproduce:
Observed Behavior:
The process has been running for 3 days with no significant progress.
The log file shows repeated "start" and "finish" messages for accumulate_samples and call_and_write_population_snps, as shown below:
Many species result directories in midas2_merge/snps/ are empty or contain only partial files. For example:
Expected Behavior:
The merge_snps command should complete within a reasonable time frame and produce merged SNP files for all species without leaving empty or incomplete folders.
System Information:
Log File Excerpts:
Here are some excerpts from the log file for reference:
Request:
Could you please investigate this issue? Any guidance on how to resolve it would be greatly appreciated. I'm particularly concerned about the empty species folders and the long runtime without progress.
Thank you for your help!