Closed trx296554555 closed 8 months ago
Hi Robin,
Thank you so much for bringing this up. We previously took this approach because hmmscan does not support multithreading, whereas hmmsearch does. Therefore, we will remove the multiprocessing part and just use the built-in multithreading of hmmsearch.
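As a rough illustration of that plan, a single hmmsearch call can be given a thread count through its `--cpu` option instead of being launched many times in parallel. The sketch below only builds the command line; the helper name `build_hmmsearch_cmd` and the file names are hypothetical and are not taken from run_dbcan.

```python
import subprocess


def build_hmmsearch_cmd(hmm_db, seq_file, out_file, cpus):
    """Build an hmmsearch command that uses hmmsearch's own worker threads.

    Hypothetical helper for illustration only; not part of run_dbcan.
    """
    return [
        "hmmsearch",
        "--cpu", str(cpus),        # number of worker threads used by hmmsearch
        "--domtblout", out_file,   # per-domain tabular output file
        hmm_db,                    # HMM database (first positional argument)
        seq_file,                  # sequence file to search (second positional argument)
    ]


def run_hmmsearch(hmm_db, seq_file, out_file, cpus):
    """Run one hmmsearch process; requires hmmsearch to be on PATH."""
    cmd = build_hmmsearch_cmd(hmm_db, seq_file, out_file, cpus)
    # stdout is discarded because the results are written to out_file.
    return subprocess.run(cmd, stdout=subprocess.DEVNULL, check=True)
```

With this shape, the thread count is controlled in one place, so a user-supplied limit maps directly onto the load placed on the machine.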
Let me remove that code and test it, and then I will release version 4.1.2. Thank you so much!
Best, Le
Version 4.1.2 has now been released. Problem solved.
Thank you for your hard work and recent updates. However, I wanted to bring to your attention an ongoing issue with the latest version of run_dbcan-4.1.1. When processing large input sequence files, dbcan_sub tends to create an excessive number of threads, resulting in high system load. #117
This issue persists even when specifying parameters such as --dbcan_thread and --hmm_cpu, as there seems to be no effective limitation on the number of threads being created.
After reviewing the code of run_dbcan.py, I have identified that the issue lies within the function split_uniInput. This section of code directly launches as many subprocesses as the number of small files generated by splitting the large input file: https://github.com/linnabrown/run_dbcan/blob/707aed21a0ef455828126f1afb5820963e8274ca/dbcan/cli/run_dbcan.py#L139C1-L157C22
I made modifications to this specific code section to prevent excessive load when I used it myself. I implemented a simple ThreadPool, but I'm unsure if this could potentially affect other parts of the program. Therefore, I offer it as a reference only.
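The general idea of capping concurrency with a thread pool can be sketched as follows. This is a minimal illustration under assumptions, not the exact patch: `run_commands_bounded` and its parameters are hypothetical names, and the commands are placeholders for whatever is run on each split file.

```python
import subprocess
import sys
from multiprocessing.pool import ThreadPool


def run_commands_bounded(commands, max_workers):
    """Run each command as a subprocess, at most max_workers at a time.

    Hypothetical helper for illustration; returns the exit codes in input order.
    """
    def run_one(cmd):
        # Each pool thread blocks on one child process, so the pool size
        # bounds the number of subprocesses running at the same moment.
        return subprocess.run(cmd, check=False).returncode

    with ThreadPool(processes=max_workers) as pool:
        # map() waits for all commands to finish before returning.
        return pool.map(run_one, commands)


if __name__ == "__main__":
    # Placeholder commands standing in for one job per split input file.
    demo = [[sys.executable, "-c", "pass"] for _ in range(4)]
    print(run_commands_bounded(demo, max_workers=2))
```

Threads are sufficient here because the heavy work happens in the child processes; the pool only waits on them, so the worker count acts as the limit on simultaneous subprocesses.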
Best, Robin