johnsolk closed this issue 7 years ago
Two possibilities. First, you should probably omit the sudo -- dammit doesn't need administrator privileges to run, and sudo changes the $PATH variable. Second, try calling Popen with shell=True, which runs the command through the shell and makes your exported environment variables available. BUSCO, TransDecoder, and LAST were installed manually and the exports are in your .bashrc, so without that being sourced (i.e., shell=True), they aren't being found.
The relevant docs for Popen are here: https://docs.python.org/2/library/subprocess.html#popen-constructor
Lemme know if that helps!
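To illustrate the shell=True point, here's a minimal sketch (using echo rather than dammit itself) showing that the command string is run through the shell, so variables like $PATH get expanded in the child process:

```python
import subprocess

# With shell=True the string is handed to the shell, which expands
# $PATH from the inherited environment. Without shell=True, Popen
# expects an argument list and performs no shell expansion.
proc = subprocess.Popen("echo $PATH", shell=True,
                        stdout=subprocess.PIPE)
out, _ = proc.communicate()
print(out.decode().strip())
```

Under sudo, the same snippet would print a reset $PATH that no longer includes the directories where BUSCO, TransDecoder, and LAST were installed.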
Removing sudo worked, running now. Thank you! I was originally using shell=True, just forgot to put that in my question.
If you stop and then restart, will the pipeline pick up from where it left off? It's running, but at first there were a few miscellaneous errors about not finding the BUSCO and tblastn results (below). Should I just wait for it to finish to see how it worked?
New DB title: Trinity.fasta
Sequence type: Nucleotide
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 16048 sequences in 0.548864 seconds.
BLAST Database error: CSeqDBAtlas::MapMmap: While mapping file [/mnt/mmetsp/Micromonas_pusilla/SRR1300455/dammit_dir/Trinity.fasta.dammit/Trinity.fasta.busco.results.nin] with 0 bytes allocated, caught exception:
NCBI C++ Exception:
"/build/buildd/ncbi-blast+-2.2.28/c++/src/corelib/ncbifile.cpp", line 4703: Error: ncbi::CMemoryFileMap::CMemoryFileMap() - To be memory mapped the file must exist: /mnt/mmetsp/Micromonas_pusilla/SRR1300455/dammit_dir/Trinity.fasta.dammit/Trinity.fasta.busco.results.nin
eukaryota
*** Running tBlastN ***
*** Getting coordinates for candidate transcripts! ***
Traceback (most recent call last):
File "/home/ubuntu/BUSCO_v1.1b1/BUSCO_v1.1b1.py", line 347, in <module>
f=open('%s_tblastn' % args['abrev']) #open input file
FileNotFoundError: [Errno 2] No such file or directory: 'Trinity.fasta.busco.results_tblastn'
[ ] TransDecoder.LongOrfs:Trinity.fasta
CMD: /home/ubuntu/TransDecoder-2.0.1/util/compute_base_probs.pl Trinity.fasta 0 > Trinity.fasta.transdecoder_dir/base_freqs.dat
-first extracting base frequencies, we'll need them later.
CMD: touch Trinity.fasta.transdecoder_dir/base_freqs.dat.ok
- extracting ORFs from transcripts.
-total transcripts to examine: 16048
[16000/16048] = 99.70% done
#################################
### Done preparing long ORFs. ###
##################################
Use file: Trinity.fasta.transdecoder_dir/longest_orfs.pep for Pfam and/or BlastP searches to enable homology-based coding region identification.
Then, run TransDecoder.Predict for your final coding region predictions.
[ ] hmmscan:longest_orfs.pep.x.Pfam-A.hmm
It should resume without issues -- if it doesn't, please let me know :)
I wrote a script to run dammit separately for many assemblies. The script writes and runs a dammitfile for each command. Example contents of dammitfile:
But when the automated script runs with subprocess.Popen("sudo bash " + dammitfile), there is an error that some, but not all, of the dependencies (TransDecoder, LAST, BUSCO) are not installed (below). I can run the same command manually and it works fine with no problems. Is there something I can do so that the subprocess finds the dependencies?
submodule: annotate
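Applying the fix from the reply above, the per-assembly call can be sketched like this (the dammitfile here is a hypothetical stand-in script, not a real dammit run): drop the sudo, keep shell=True, and make sure there is a space between "bash" and the script path.

```python
import os
import subprocess
import tempfile

# Hypothetical stand-in for one generated dammitfile: a shell script
# that just reports the PATH it sees.
with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as fh:
    fh.write("echo running with PATH=$PATH\n")
    dammitfile = fh.name

# No sudo (it resets $PATH) and shell=True so the command string is
# interpreted by the shell with the inherited environment.
proc = subprocess.Popen("bash " + dammitfile, shell=True,
                        stdout=subprocess.PIPE)
out, _ = proc.communicate()
print(out.decode().strip())
os.remove(dammitfile)
```

The same pattern applies per assembly in a loop: write the dammitfile, launch it with Popen, and wait on the process before moving to the next one.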