Hello,
Sorry if this question is naive.
The VSEARCH dereplication step of the metontiime.sh pipeline is taking a very long time (16 hours so far). Roughly how long should it take to complete with the specs listed below?
Let me know if you have any suggestions or thoughts!
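For reference, here is a minimal sketch of the QIIME 2 vsearch calls behind this stage of the pipeline (the artifact names, identity threshold, and thread count are placeholder values; MetONTIIME builds the actual commands internally). To my knowledge, vsearch dereplication is essentially single-threaded, while the clustering step accepts a `--p-threads` parameter:

```sh
# Dereplicate identical sequences (single-threaded in vsearch, hence slow on millions of reads)
qiime vsearch dereplicate-sequences \
  --i-sequences demux_seqs.qza \
  --o-dereplicated-table table_tmp.qza \
  --o-dereplicated-sequences rep_seqs_tmp.qza

# Cluster features de novo at 97% identity; this step can use multiple threads
qiime vsearch cluster-features-de-novo \
  --i-table table_tmp.qza \
  --i-sequences rep_seqs_tmp.qza \
  --p-perc-identity 0.97 \
  --p-threads 8 \
  --o-clustered-table table.qza \
  --o-clustered-sequences rep_seqs.qza
```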
Hi timyaro, I can't give a precise estimate, but that is certainly quite a large number of reads (roughly 5M), so I expect the process might take a couple of days to complete. Unfortunately, pipelines based on single-read alignment are quite slow. From the htop command you should be able to see where the temporary file is being stored, and you can use that file to count the number of reads (1 read -> 1 row) that have already been processed. SM
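If useful, a minimal sketch of that progress check, assuming the temporary file has one row per processed read (the path below is hypothetical; locate the real one via htop or lsof):

```sh
# Count reads processed so far: one row per read in the temporary file
wc -l < /tmp/vsearch_tmp.tsv

# Or poll it every 60 seconds to watch progress
watch -n 60 "wc -l < /tmp/vsearch_tmp.tsv"
```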
Hi, I am closing this issue due to inactivity. In case you have any further issues, please reopen it. SM
Hello @MaestSi. I appreciate the reply! I have run into a new obstacle.
I am running on Ubuntu, and cluster-features-de-novo stopped with an error on its own; I have no idea why. According to the tmp file (its contents are still in the tmp folder), 2.8 million of my 3.46 million reads had been processed, which took 9 days. Is there any way to resume from where it left off, or do I have to restart the entire process?
I'm almost tempted to move it to AWS with 96 vCPUs and an absurd amount of memory. Would that bring the run time down from days to a couple of hours? I don't want to resort to this because of the cost.
Hi @timyaro, I fear there is no easy way to resume it from where it crashed, as far as I know. I think the error may be due to insufficient RAM; 16 GB is quite a low amount. My advice would be to re-run the analysis on a random sample of 10k-30k reads per sample first. In parallel, you may try running it on AWS, but I am not confident the reduction in run time would be that drastic. SM
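As an illustration of the subsampling suggestion, a minimal sketch using seqtk (my choice of tool, not necessarily the pipeline's; MetONTIIME may expose its own subsampling option, and the file names here are hypothetical):

```sh
# Randomly draw 10k reads per sample; a fixed seed (-s) makes the draw reproducible
for f in sample*.fastq.gz; do
  seqtk sample -s 42 "$f" 10000 | gzip > "subsampled_${f}"
done
```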