deropi closed this issue 3 years ago
There are a number of possibilities; some more information might help narrow it down. What data are you training on, and what was the exact command that you used? How did you install PlasClass? How much RAM and how many processors does your system have?
My first guess is that PlasClass is still running but taking a really long time. Did you see that the program actually finished running? How long did you run it for? Maybe the process is stuck due to a memory bottleneck - can you see how much memory it is using up?
Sure,
I was training on a small dataset with 40 chromosomal sequences and 200 plasmids:
train.py -p pls_contigs.fasta -c chr_contigs.fasta -o train_set/ -n 25
I work on a system with 1 TB of RAM, so I guess that should not be the issue, although I did not check memory usage during the run.
I am working on a cluster and I am a little limited in installing software so I installed it via conda. I had to add the shebang to a couple of scripts but the classifier seems to work like a charm.
I left the script running under nohup, so the only output I got was what I posted earlier, with no error whatsoever.
Thanks for the help :)
Thanks for the info - I see that I added shebangs to the script after the conda release so it's good that you caught that.
Is there any way you can run it without nohup to see if there is more output and to check that it exits correctly after finishing? Also if possible - check RAM usage while it is running? Or maybe it is just hanging and not exiting at all?
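If running in the foreground is awkward on a cluster, something like the following might help. It is only a rough sketch, not part of PlasClass: `rss_kib` and `run_and_monitor` are made-up helper names, and the memory reading assumes a Linux system with `/proc`. It launches a command, writes stdout and stderr to one log file, and reports the exit code along with the peak resident memory it observed.

```python
import subprocess


def rss_kib(pid):
    """Return resident memory of `pid` in KiB, or None if unavailable.

    Assumption: Linux, where /proc/<pid>/status has a VmRSS line.
    """
    try:
        with open(f"/proc/{pid}/status") as fh:
            for line in fh:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except (OSError, ValueError):
        pass
    return None


def run_and_monitor(cmd, log_path, poll_seconds=5.0):
    """Run `cmd`, send stdout and stderr to `log_path`, and track peak RSS.

    Returns (exit_code, peak_rss_kib). Only the launched process itself is
    measured; worker processes it spawns are not included.
    """
    peak = 0
    with open(log_path, "w") as log:
        proc = subprocess.Popen(cmd, stdout=log, stderr=subprocess.STDOUT)
        while True:
            # Sample memory, then wait up to poll_seconds for the process to exit.
            peak = max(peak, rss_kib(proc.pid) or 0)
            try:
                proc.wait(timeout=poll_seconds)
                break
            except subprocess.TimeoutExpired:
                continue
    return proc.returncode, peak


# Example call (command taken from this thread; adjust paths to your data):
# code, peak = run_and_monitor(
#     ["train.py", "-p", "pls_contigs.fasta", "-c", "chr_contigs.fasta",
#      "-o", "train_set/", "-n", "25"],
#     "train.log",
# )
# print(f"exit code {code}, peak RSS about {peak / 1024:.0f} MiB")
```

A non-zero exit code, or a log that ends without any final message, would suggest the run was interrupted rather than completed. Because only the parent process is sampled, the peak will understate total memory if most of the work happens in pool workers.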
Just based on the fact that it seems to have stopped in the k-mer counting phase, it could be a problem with the Python multiprocessing pool that is used. What version of Python is being used in your conda env?
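To answer the version question and rule out a basic pool problem, a quick check along these lines could be run inside the conda env. This is a generic sketch and does not use any PlasClass code: `describe_environment` and `pool_smoke_test` are hypothetical names, and the smoke test only shows whether a trivial `multiprocessing.Pool` job returns at all on this machine.

```python
import multiprocessing as mp
import os
import platform
import sys


def describe_environment():
    """Collect the interpreter details that are useful in a bug report."""
    return {
        "python": platform.python_version(),
        "executable": sys.executable,  # shows which env the interpreter comes from
        "platform": platform.platform(),
        "start_method": mp.get_start_method(allow_none=True),
        "cpu_count": os.cpu_count(),
    }


def pool_smoke_test(n_procs=2, timeout=30):
    """Run a trivial job through a Pool.

    Raises multiprocessing.TimeoutError if no result arrives within `timeout`
    seconds, which would point at the pool rather than at the training data.
    """
    with mp.Pool(processes=n_procs) as pool:
        # len is a builtin, so it pickles cleanly under any start method.
        return pool.map_async(len, ["ACGT", "AC", ""]).get(timeout=timeout)
```

Save this as a script and call both functions under an `if __name__ == "__main__":` guard, for example `print(describe_environment())` followed by `print(pool_smoke_test())`. If the second call times out, the pool itself is likely the problem on that system.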
@deropi were you able to see if the program really finished?
Hi, sorry for the late reply. The program finished correctly. I made a silly mistake and moved some files while the process was running. Thanks a lot for the help!
Great to hear that it worked
Hi! I've been trying to use the train.py script. I don't really get an error, but I don't get the model files either. The output directory is empty and the last lines of the stdout are these:
Starting PlasClass training
Getting reference lengths
Sampling 90000 fragments for length 1000
Getting k-mer frequencies
Any ideas what might be going on? Thanks!