Closed carolynzy closed 3 years ago
Hi @carolynzy, thanks for the email. Note that I might take some time to reply as I am away from the lab; perhaps @vinuesa can also help. I would start with the number of taxa: check the messages printed while the gbk/faa files are parsed. Make sure you really have 420 files in the input folder and that all of them are read in as expected. Second, when running with parallel, control how many processes you spawn with the --jobs parameter so that memory use stays within bounds. Bruno
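The file-count check suggested above can be sketched in a few lines of Python. This is only an illustration, not part of get_homologues: the function names are hypothetical, and the extensions `.gbk` and `.faa` are an assumption taken from the file types mentioned in this thread, so adjust them to match your input.

```python
from pathlib import Path


def count_input_files(folder, extensions=(".gbk", ".faa")):
    """Count regular files in `folder` whose names end with one of `extensions`.

    The default extensions are an assumption for illustration.
    """
    return sum(
        1
        for path in Path(folder).iterdir()
        if path.is_file() and path.name.endswith(tuple(extensions))
    )


def check_expected(folder, expected, extensions=(".gbk", ".faa")):
    """Return (matches, found): whether the count equals `expected`, and the count."""
    found = count_input_files(folder, extensions)
    return found == expected, found
```

For example, `check_expected("my_input_folder", 420)` (the folder name is a placeholder) returns a pair telling you whether exactly 420 matching files were found and how many there actually were.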
Thank you @brunocontrerasmoreira! I think I have found the problem. A previous run had been interrupted, and I resumed it without deleting the tmp folder, so that is probably why fewer taxa were found. I then used --jobs -2 in the parallel command, and the process is running well now, with no errors or warnings so far.
Hi, I'm using get_homologues.pl to analyse my 420 MTB samples. I have finished the blastn step with -o and run it in dryrun mode as described in the manual. When I then ran parallel with dryrun.txt, I had two problems. After the program had been running for a while, the screen printed:
# construct_taxa_indexes: number of taxa found = 69
# out of memory
My questions are:
Despite this message, the process continued, and "# construct_taxa_indexes: number of taxa found = 69" was repeated many times, although "out of memory" did not always appear with it.
I hope I have made myself clear. I desperately need your help. I have spent the past two weeks working on this and have been stuck here. Thank you so much for your time and advice!