Closed by morgansobol 1 year ago
Hmm... that is curious. I ran 36k genomes, so the absolute number is likely not the issue. (Though the size of your individual genomes may be larger.)
A few thoughts:
Let me know what you find and if I can be of any help!
Best, David
Hi David,
I could not find the log output files in output/genomes/ because the job was getting hung up and the entire run was killed. So I went ahead and tried point 2, which worked for me. The prediction finished in about one hour. Yay!
However, I have ~100 genomes that could not be predicted because no rRNA could be found. This is probably for the same reason as in issue #2, since some of my genomes are MAGs. How do you change the script so that it does not rely solely on 16S? Can I simply re-run it on only those genomes that did not work the first time?
Okay, so I have re-arranged the regression models so those which specifically exclude certain feature classes are neatly organized into their own subdirectory. So when doing your predictions, pick the appropriate subdirectory from data/calculations/prediction/regression_models/
Hopefully this helps!
Hi again David,
I am having some issues with memory allocation... I guess? I allocated 20 GB of memory when I submitted the job. I have ~630 genomes. It seems to only have issues with tRNAscan and barrnap, and right now it's hard to tell if this is for every genome or just some. Do you have any experience with these issues?
error with barrnap bacterial for GCF_002240205.1.fa with a message of sh: line 1: 1282914 Killed
error with tRNAscan for GCF_001747405.1.fa with a message of sh: line 1: 1279739 Killed
error with tRNAscan for GCF_000019165.1.fa with a message of sh: line 1: 295420 Bus error (core dumped)
Here is the full version of the prediction.log file as it is now; it's still running after 15.5 hrs. https://www.dropbox.com/s/dxybinnbcrui3ke/prediction.log?dl=0
Thx!
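To tell whether the failures affect every genome or only some, one option is to tally the error lines in prediction.log by genome and by tool. The following is a minimal sketch, not part of the tool itself: it assumes every failure line follows the same "error with TOOL for GENOME with a message of MESSAGE" pattern as the three lines quoted above, and the function name summarize_failures is hypothetical. As a general note, "Killed" usually means the process received SIGKILL, which is what the kernel or a cluster scheduler typically sends when a job exceeds its memory limit, so the tally may help show whether the larger genomes are the ones failing.

```python
import re
from collections import defaultdict

# Pattern inferred from the three error lines quoted in this thread;
# other lines in prediction.log may use a different format (assumption).
ERROR_RE = re.compile(
    r"error with (?P<tool>.+?) for (?P<genome>\S+) with a message of (?P<message>.*)"
)


def summarize_failures(log_lines):
    """Map each genome file name to a list of (tool, message) failures."""
    failures = defaultdict(list)
    for line in log_lines:
        match = ERROR_RE.search(line)
        if match:
            failures[match["genome"]].append(
                (match["tool"], match["message"].strip())
            )
    return dict(failures)


# The three error lines from this thread, used as sample input.
sample = [
    "error with barrnap bacterial for GCF_002240205.1.fa with a message of sh: line 1: 1282914 Killed",
    "error with tRNAscan for GCF_001747405.1.fa with a message of sh: line 1: 1279739 Killed",
    "error with tRNAscan for GCF_000019165.1.fa with a message of sh: line 1: 295420 Bus error (core dumped)",
]

failed = summarize_failures(sample)
for genome, errors in sorted(failed.items()):
    for tool, message in errors:
        print(f"{genome}\t{tool}\t{message}")
```

To run it on the real log, replace sample with the lines of the file, for example by passing open("prediction.log") to summarize_failures. The resulting genome names could then be used to build a smaller input set for a re-run.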