agroppi closed this issue 8 years ago
Hi,
I have modified the script get_draft_path.py by adding "shell=True":
stream = subprocess.Popen(DBshow_cmd.split(),
stdout=subprocess.PIPE,bufsize=1,shell=True)
It seems to improve the process, but now I have another error :
Run postprocessing: Mon Aug 29 16:11:39 CEST 2016
#############################
get draft assembly : Mon Aug 29 16:24:02 CEST 2016
/scratch/ag/Vitireseq_Fasta/HINGE_Test/vriparia_test: DBshow: command not found
Traceback (most recent call last):
File "/home/ag/rainman_home/HINGE/scripts/get_draft_path.py", line 101, in
Hi again,
I put the full path to DBshow:
DBshow_cmd = "/home/ag/rainman_home/HINGE/thirdparty/DAZZ_DB/DBshow "+ filedir+'/'+ filename+' '+dbshow_reads
stream = subprocess.Popen(DBshow_cmd.split(),
stdout=subprocess.PIPE,bufsize=1,shell=True)
New error :
Run postprocessing: Mon Aug 29 16:37:34 CEST 2016
#############################
get draft assembly : Mon Aug 29 16:49:43 CEST 2016
Usage: DBshow [-unqUQ] [-w<int(80)>] [-m
Hi Alexis,
Would it be possible for you to send us the vriparia_testhinge_1.G2.graphml file that triggers the error, so that we can debug it? The graphml file doesn't contain the sequence information, so you wouldn't be sharing any data (in case that's a concern for you).
Thanks, Govinda.
Hi Alexis,
It seems that the DBshow command is not running properly. If you run
/home/ag/rainman_home/HINGE/thirdparty/DAZZ_DB/DBshow $PBS_O_WORKDIR/vriparia_test 121
does it print the sequence for read 121?
Hi,
This command prints:
m000_000/124/0_14063 acggttagctttaaaaaaaacataatctttgtggaaaaggtgttttaagctaaccatataaataataactgattttaaat ...etc ...
Hi,
It seems that there is a shift between the read number passed to DBshow and the identifier in the FASTA output. I've tested:
From 1 to 12 ==> correct
From 13 to 76 ==> shift +1
Example:
ag@jarvis:~$ /home/ag/rainman_home/HINGE/thirdparty/DAZZ_DB/DBshow /scratch/ag/Vitireseq_Fasta/HINGE_Test/vriparia_test 13
m000_000/14/0_12805
From 77 to 81 ==> shift +2
From 81 to ==> shift +3
At 1000 the shift is 22, etc.
Does this shift cause the problem?
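For anyone who wants to measure this shift systematically, here is a minimal sketch. It assumes the header layout seen in the output quoted above (prefix/number/range, e.g. m000_000/14/0_12805); the helper names `well_number` and `shift` are hypothetical and not part of HINGE or DAZZ_DB.

```python
def well_number(header):
    """Return the middle field of a header such as 'm000_000/14/0_12805'.

    Assumes the prefix/number/range layout shown in this thread; a leading
    '>' (FASTA header marker) is tolerated.
    """
    fields = header.lstrip(">").strip().split("/")
    return int(fields[1])


def shift(read_index, header):
    """Difference between the header number and the index given to DBshow."""
    return well_number(header) - read_index


# Values taken from the outputs quoted in this thread.
print(shift(13, "m000_000/14/0_12805"))
print(shift(121, "m000_000/124/0_14063"))
```

With the two headers quoted in this thread, the sketch reports shifts of 1 and 3, which matches the ranges listed above.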
I don't think this shifting is causing the problem.
The fact that when you run get_draft_path you get
Usage: DBshow [-unqUQ] [-w] [-m]+ path:db|dam [ reads:FILE | reads:range ... ]
as part of your output suggests that the script is not calling DBshow in the correct manner. Can you modify the script to print the string DBshow_cmd (line 22) and show me what it prints? It's going to be a long string, as it includes thousands of read ids, but the beginning is what matters. Thanks.
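Since the command string is very long, one way to print only its beginning is sketched below. The `preview` helper and the stand-in command are hypothetical; in get_draft_path.py you would pass the real DBshow_cmd instead.

```python
def preview(cmd, limit=200):
    """Return at most `limit` characters of a long command string."""
    if len(cmd) <= limit:
        return cmd
    return cmd[:limit] + " ..."


# Hypothetical stand-in for the string built in get_draft_path.py:
# an executable, a database path, and thousands of read ids.
dbshow_reads = " ".join(str(i) for i in range(1, 5000))
DBshow_cmd = "DBshow /path/to/db " + dbshow_reads

print(preview(DBshow_cmd))
```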
Hi,
Here is the beginning of the DBshow_cmd string:
/home/ag/rainman_home/HINGE/thirdparty/DAZZ_DB/DBshow /scratch/ag/Vitireseq_Fasta/HINGE_Test/vriparia_test 121 680 767 883 958 1035 1407 1568
I have also attached the whole output.
Thanks again
DBshow_cmd_output.txt
Hi Alexis, if you just run
/home/ag/rainman_home/HINGE/thirdparty/DAZZ_DB/DBshow /scratch/ag/Vitireseq_Fasta/HINGE_Test/vriparia_test 121 680 767 883 958 1035 1407 1568
do the sequences appear properly?
This command looks fine, but above you said that when you ran get_draft_path, it was printing
Usage: DBshow [-unqUQ] [-w] [-m]+ path:db|dam [ reads:FILE | reads:range ... ]
This should only be printed if there was a problem with the command DBshow_cmd.
Yes, this command shows the sequences properly.
But when you run the script you are still getting
Usage: DBshow [-unqUQ] [-w] [-m]+ path:db|dam [ reads:FILE | reads:range ... ]
?
Ah! The problem seems to be in the modification of the command.
It should be
DBshow_cmd = "/home/ag/rainman_home/HINGE/thirdparty/DAZZ_DB/DBshow "+ filedir+'/'+ filename+' '+dbshow_reads
stream = subprocess.Popen(DBshow_cmd,
stdout=subprocess.PIPE,bufsize=1,shell=True)
instead of
DBshow_cmd = "/home/ag/rainman_home/HINGE/thirdparty/DAZZ_DB/DBshow "+ filedir+'/'+ filename+' '+dbshow_reads
stream = subprocess.Popen(DBshow_cmd.split(),
stdout=subprocess.PIPE,bufsize=1,shell=True)
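When using shell=True, the command should be passed as a single string, not split into a list. With a list, only the first element is run as the command and the remaining elements are handed to the shell itself, so DBshow starts with no arguments and prints its usage message. This stackexchange answer explains it.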
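The difference can be reproduced without DBshow. The sketch below uses `echo` as a stand-in (an assumption made only so that the example runs on any POSIX system) and compares the two ways of calling Popen with shell=True.

```python
import subprocess

# Stand-in command: echo plays the role of DBshow, the numbers play the
# role of the read ids.
cmd = "echo 121 680 767"

# List + shell=True: on POSIX only the first item ("echo") is the command
# line; the remaining items become arguments of the shell itself, so echo
# receives no arguments at all.
split_out = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE,
                             shell=True).communicate()[0]

# String + shell=True: the shell parses the whole line, so echo receives
# all three arguments.
string_out = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                              shell=True).communicate()[0]

print(repr(split_out))
print(repr(string_out))
```

On a POSIX system the first call should produce only an empty line, while the second prints the three numbers. That mirrors what happened in the script: DBshow was started without its arguments and answered with its usage message.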
Unfortunately yes :(
Run postprocessing: Wed Aug 31 20:57:49 CEST 2016
#############################
get draft assembly : Wed Aug 31 21:11:57 CEST 2016
Usage: DBshow [-unqUQ] [-w<int(80)>] [-m
any idea ?
Sorry didn't see your last comment :/ I'll try
Great news : It works ! :)
#############################
Run postprocessing: Wed Aug 31 21:18:07 CEST 2016
#############################
get draft assembly : Wed Aug 31 21:31:43 CEST 2016
[2016-08-31 21:31:46.789] [log] [info] draft consensus
[2016-08-31 21:31:46.789] [log] [info] name of db: vriparia_test, name of .las file vriparia_test.las
[2016-08-31 21:31:46.789] [log] [info] name of fasta: , name of .paf file
[2016-08-31 21:31:46.789] [log] [info] filter files prefix: vriparia_test
[2016-08-31 21:31:46.789] [log] [info] output prefix: vriparia_test.draft
[2016-08-31 21:31:46.790] [log] [info] Parameters passed in
Although I have read that using shell=True is not a good idea (https://security.openstack.org/guidelines/dg_avoid-shell-true.html) ...
I will keep you informed about the rest of the pipeline
Thanks again
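For readers who would rather avoid shell=True, a possible alternative is to keep the argument list and give Popen the full path to the executable, with no shell involved. The sketch below is an untested assumption about how that could look for get_draft_path.py, not the fix adopted in this thread; `run_tool` is a hypothetical helper, and the Python interpreter stands in for DBshow so that the example runs anywhere.

```python
import os
import subprocess
import sys


def run_tool(executable, args):
    """Run `executable` with `args` without a shell and return its stdout."""
    if not os.path.isfile(executable):
        # Without a shell, a missing executable surfaces as
        # "OSError: [Errno 2] No such file or directory", so fail early
        # with a message that names the path.
        raise OSError("executable not found: " + executable)
    proc = subprocess.Popen([executable] + list(args), stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    return out


# Stand-in for DBshow: the Python interpreter echoing its arguments.
# In the real script, `executable` would be the full path to DBshow and
# `args` the database path followed by the read ids.
echo_args = ["-c", "import sys; print(' '.join(sys.argv[1:]))",
             "121", "680", "767"]
out = run_tool(sys.executable, echo_args)
print(out.decode().strip())
```

Because every argument is passed as its own list element, no shell ever parses the command line, which is the concern raised in the OpenStack guideline linked above.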
Hi,
I am still trying to run HINGE on my data set. Here is a test on a single SMRT Cell. All previous steps went well, but now I'm stuck with:
Run postprocessing: Mon Aug 29 11:10:47 CEST 2016
get draft assembly : Mon Aug 29 11:23:49 CEST 2016
Traceback (most recent call last):
  File "/home/ag/rainman_home/HINGE/scripts/get_draft_path.py", line 24, in
    stdout=subprocess.PIPE,bufsize=1)
  File "/module/apps/python/2.7.9/lib/python2.7/subprocess.py", line 710, in __init__
    errread, errwrite)
  File "/module/apps/python/2.7.9/lib/python2.7/subprocess.py", line 1335, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
[2016-08-29 11:23:50.855] [log] [info] draft consensus
[2016-08-29 11:23:50.855] [log] [info] name of db: vriparia_test, name of .las file vriparia_test.las
[2016-08-29 11:23:50.855] [log] [info] name of fasta: , name of .paf file
[2016-08-29 11:23:50.855] [log] [info] filter files prefix: vriparia_test
[2016-08-29 11:23:50.855] [log] [info] output prefix: vriparia_test.draft
[2016-08-29 11:23:50.855] [log] [info] Parameters passed in
My lines of script :
echo "#############################"
echo "Run postprocessing:"
date
python2.7 /home/ag/rainman_home/HINGE/scripts/pruning_and_clipping.py vriparia_test.edges.hinges vriparia_test.hinge.list hinge_1
echo "#############################"
echo "get draft assembly :"
date
python2.7 /home/ag/rainman_home/HINGE/scripts/get_draft_path.py $PBS_O_WORKDIR vriparia_test vriparia_testhinge_1.G2.graphml
/home/ag/rainman_home/HINGE/build/bin/consensus/draft_assembly --db vriparia_test --las vriparia_test.las --prefix vriparia_test --config /home/ag/rainman_home/HINGE/utils/nominal.ini --out vriparia_test.draft
I updated get_draft_path.py this morning according to https://github.com/fxia22/HINGE/commit/9ea413d3ddd71dec54d1fb5db5a1c7db606a7317
Thanks for your help