levinas closed this issue 9 years ago
The test completed successfully. I'm reopening the issue to make sure the logging is correct.
Here's the megahit.out file I see in the work directory (note that the --input-cmd
parameter contains only one file):
Command: megahit --cpu-only --num-cpu-threads 4 -l 512 -m 131448646144 --input-cmd p1.fq -o megahit
MEGAHIT v0.2.0
[Thu Feb 12 16:25:56 2015] Start assembly. Number of CPU threads 4.
[Thu Feb 12 16:25:56 2015] Extracting solid (k+1)-mers for k = 21
[Thu Feb 12 16:26:01 2015] Building graph for k = 21
[Thu Feb 12 16:26:05 2015] Assembling contigs from SdBG for k = 21
[Thu Feb 12 16:26:11 2015] Extracting iterative edges from k = 21 to 31
[Thu Feb 12 16:26:12 2015] Building graph for k = 31
[Thu Feb 12 16:26:13 2015] Assembling contigs from SdBG for k = 31
[Thu Feb 12 16:26:14 2015] Extracting iterative edges from k = 31 to 41
[Thu Feb 12 16:26:14 2015] Building graph for k = 41
[Thu Feb 12 16:26:14 2015] Assembling contigs from SdBG for k = 41
[Thu Feb 12 16:26:14 2015] Extracting iterative edges from k = 41 to 51
[Thu Feb 12 16:26:15 2015] Building graph for k = 51
[Thu Feb 12 16:26:15 2015] Assembling contigs from SdBG for k = 51
[Thu Feb 12 16:26:15 2015] Extracting iterative edges from k = 51 to 61
[Thu Feb 12 16:26:15 2015] Merging to output final contigs.
[Thu Feb 12 16:26:15 2015] ALL DONE.
The actual command seems correct:
['/home/ubuntu/assembly/third_party/megahit/megahit', '--cpu-only', '--num-cpu-threads', '4', '-l', '512', '-m', '131448646144', '--input-cmd', u'cat /mnt/data/fangfang/119/raw/p2.fq /mnt/data/fangfang/119/raw/p1.fq', '-o', u'/mnt/data/fangfang/119/107/megahit_fff2ac6b-67d3-42d6-b160-d4ca6b04a211/megahit']
The argument in the list after "input-cmd" is basically "cat " + " ".join(files).
Do you want me to test with more than one file before closing the issue?
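For reference, here is a minimal sketch of how an argument list like the one above could be assembled. The function name `build_megahit_args` and its parameters are hypothetical and not taken from the plugin; only the flags and the "cat " + " ".join(files) construction come from this thread.

```python
def build_megahit_args(executable, files, out_dir,
                       threads=4, min_len=512, memory=131448646144):
    """Assemble a megahit argument list (hypothetical helper, not the plugin's code)."""
    # The whole "cat f1 f2 ..." string is passed as ONE list element,
    # so megahit receives it as the single value of --input-cmd.
    input_cmd = "cat " + " ".join(files)
    return [
        executable,
        "--cpu-only",
        "--num-cpu-threads", str(threads),
        "-l", str(min_len),
        "-m", str(memory),
        "--input-cmd", input_cmd,
        "-o", out_dir,
    ]


args = build_megahit_args("megahit", ["p2.fq", "p1.fq"], "megahit")
print(args)
```

Note that a plain " ".join(files) assumes the file paths contain no spaces or shell metacharacters; quoting each path (for example with shlex.quote) would be safer if that cannot be guaranteed.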
I think the command is probably passed correctly. Can you look into how to get the "Command: megahit ..." line in the output file to reflect the "cat ..." subcommand correctly?
This can be done by printing the command from within the megahit assembly service plugin.
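As a rough illustration of that idea, the plugin could format the argument list with shell quoting before logging it, so the multi-word "cat ..." value stays visibly a single argument. This is only a sketch: `format_command` is a hypothetical helper, the shortened paths are placeholders, and how the plugin actually writes its log is not shown in this thread.

```python
import shlex


def format_command(args):
    """Render an argument list as a single shell-quoted line (hypothetical helper)."""
    # shlex.quote leaves simple tokens alone and wraps tokens that
    # contain spaces in single quotes.
    return " ".join(shlex.quote(a) for a in args)


args = ["megahit", "--cpu-only", "--input-cmd", "cat p2.fq p1.fq", "-o", "megahit"]
print("Command: " + format_command(args))
# prints: Command: megahit --cpu-only --input-cmd 'cat p2.fq p1.fq' -o megahit
```

Because the quoting is reversible, shlex.split(format_command(args)) gives back the original list, which makes the logged line safe to copy and rerun.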
MEGAHIT is now part of dev recipes and testing. The issue will be closed when Seb completes the test on big files.
I tested with a pair of files.
seb@kbase-devel:~/bug-167/job-80$ arast get -j 80
File downloaded: 80_1.megahit_contigs.fa
File downloaded: 80_report.txt
File downloaded: 80_analysis.tar.gz
HTML extracted: 80_analysis/report.html
seb@kbase-devel:~/bug-167/job-80$ ls -lh
total 2.4M
-rw-rw-r-- 1 seb seb 2.3M Feb 12 23:08 80_1.megahit_contigs.fa
drwxrwxr-x 5 seb seb 4.0K Feb 12 23:08 80_analysis
-rw-rw-r-- 1 seb seb 6.3K Feb 12 23:08 80_report.txt
seb@kbase-devel:~/bug-167/job-80$ head 80_report.txt
All statistics are based on contigs of size >= 500 bp, unless otherwise noted (e.g., "# contigs (>= 0 bp)" and "Total length (>= 0 bp)" include all contigs).
Assembly megahit_contigs
Total length (>= 0 bp) 2300567
Total length (>= 1000 bp) 1807716
Largest contig 132002
Total length 1831638
seb@kbase-devel:~/bug-167/job-80$ arast stat -d|grep 80
| 80 | 4 | Complete | 1:20:47 | None | -p megahit |
root@kbase-devel:/kbase/arast/assembly# git log --oneline|grep megahit
7bcc026 megahit: normalize memory limit using the total thread count
af59dd0 plugins: remove foo=bar option for megahit
9b39c30 plugins: add megahit plugin
e4600be plugins: add megahit configuration file
8c5061c Remove megahit version and release numbers
1c13a6e tools: add megahit package