CSB5 / lofreq

LoFreq Star: Sensitive variant calling from sequencing data
http://csb5.github.io/lofreq/

Unable to run call-parallel on filenames with spaces #51

Open ozagordi opened 7 years ago

ozagordi commented 7 years ago

Hi, I tried to run lofreq like this:

lofreq call-parallel --pp-threads 4 -f 'path with spaces and (parenthesis)/ref_sequence.fna' bam_file.bam -o calls.vcf

and I get a bunch of

/bin/sh: -c: line 0: syntax error near unexpected token `('

I know, I know... One should not have paths with spaces, but Dropbox recently renamed its folder to Dropbox (Personal). I guess that protecting the file names with shlex.quote should do the trick.
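A minimal sketch of the suggestion above, assuming the command is assembled as a single string and handed to a shell (the `/bin/sh: -c` in the error message suggests this, but the thread does not confirm how lofreq builds its command). The variable names are illustrative only. `shlex.split` is used here just to show how a POSIX shell would tokenize each string.

```python
import shlex

ref = "path with spaces and (parenthesis)/ref_sequence.fna"

# Naive interpolation: the path is split at every space, and a real
# /bin/sh would additionally treat the bare "(" as shell syntax.
unsafe_cmd = "lofreq call -f %s bam_file.bam -o calls.vcf" % ref

# Quoted interpolation: shlex.quote wraps the path in single quotes,
# so it survives as one shell word.
safe_cmd = "lofreq call -f %s bam_file.bam -o calls.vcf" % shlex.quote(ref)

# Tokenize both strings following POSIX word-splitting rules.
unsafe_args = shlex.split(unsafe_cmd)
safe_args = shlex.split(safe_cmd)

print(len(unsafe_args), len(safe_args))
print(safe_args[3] == ref)
```

With quoting, the reference path arrives as a single argument; without it, the path is broken into several words.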

andreas-wilm commented 7 years ago

Hi Osvaldo,

Does the same command work with a non-funny path? If yes, I'll have to work out a solution as per your suggestion.

Andreas


ozagordi commented 7 years ago

Yes, a regular path works. I solved it by protecting all my subprocess calls with shlex.quote/split, so I don't even have to pass shell=True. For the call to lofreq, I copied the reference to the working directory.

(By the way, Dropbox... a name with a space and parentheses?)
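The approach described in this comment (an argument list and no `shell=True`) can be sketched as follows. This is only an illustration: the path is hypothetical, and a small Python child process stands in for the real tool so the example runs anywhere. When `subprocess` receives a list, no shell parses the arguments, so spaces and parentheses need no quoting at all.

```python
import subprocess
import sys

# Hypothetical path, for illustration only.
ref_file = "Dropbox (Personal)/db/consensus_B.fna"

# Stand-in child process that prints the first argument it receives.
# A real call would use a list such as ["samtools", "faidx", ref_file].
argv = [sys.executable, "-c", "import sys; print(sys.argv[1])", ref_file]

# No shell=True: each list element is passed to the child unchanged.
result = subprocess.run(argv, capture_output=True, text=True, check=True)
received = result.stdout.strip()

print(received == ref_file)
```

Building the list directly also avoids the quote-then-split round trip, although both routes yield the same argument vector.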

andreas-wilm commented 7 years ago

Thanks @ozagordi (sorry, just back from vacation). Am I assuming correctly that call (instead of call-parallel) itself works? I guess I really should use shlex. Any chance you've done that already and can issue a pull request? :)

ozagordi commented 7 years ago

Hi @andreas-wilm, I'm also just back to work. No, unfortunately I don't have code to issue a PR.

To answer your question in code: I have a bunch of reads aligned to a reference file. Then I do this:

import shlex
import subprocess

ref_file = '/Users/ozagordi/Dropbox (Personal)/Software/MinVar/minvar/db/consensus_B.fna'

cml = shlex.split('samtools faidx %s' % shlex.quote(ref_file))
subprocess.call(cml)

and this works:

cml = 'lofreq call -f %s refcon_sorted.bam -o calls.vcf' % shlex.quote(ref_file)
subprocess.call(shlex.split(cml))

This doesn't

cml = 'lofreq call-parallel --pp-threads 2 -f %s refcon_sorted.bam -o calls.vcf' % shlex.quote(ref_file)
subprocess.call(shlex.split(cml))

and returns

INFO [2017-08-21 11:54:54,964]: Using 2 threads with following basic args: lofreq call -f /Users/ozagordi/Dropbox (Personal)/Software/MinVar/minvar/db/consensus_B.fna refcon_sorted.bam

CRITICAL [2017-08-21 11:54:54,969]: lofreq exited with error code '1'. Command was 'lofreq idxstats refcon_sorted.bam'. stderr was: 'b'[bam_idxstats] fail to load the index.\n''
Traceback (most recent call last):
  File "/Users/ozagordi/miniconda3/bin/lofreq2_call_pparallel.py", line 746, in <module>
    main()
  File "/Users/ozagordi/miniconda3/bin/lofreq2_call_pparallel.py", line 557, in main
    bam_bins = [Region._make(x) for x in bins_from_bamheader(bam)]
  File "/Users/ozagordi/miniconda3/bin/lofreq2_call_pparallel.py", line 269, in bins_from_bamheader
    sq_list = sq_list_from_bam(bam)
  File "/Users/ozagordi/miniconda3/bin/lofreq2_call_pparallel.py", line 246, in sq_list_from_bam
    raise OSError
OSError
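The stderr quoted in the log above ("fail to load the index") may point to a missing BAM index rather than to the path quoting; this is only a reading of the log, not something confirmed in the thread. A minimal sketch of a pre-flight check, assuming the usual index file names next to the BAM file. The helper names `index_candidates` and `find_bam_index` are hypothetical and not part of lofreq.

```python
from pathlib import Path


def index_candidates(bam_path):
    """Return the index file names commonly looked up for a BAM file."""
    bam = Path(bam_path)
    return [
        Path(str(bam) + ".bai"),  # e.g. refcon_sorted.bam.bai
        bam.with_suffix(".bai"),  # e.g. refcon_sorted.bai
        Path(str(bam) + ".csi"),  # CSI index variant
    ]


def find_bam_index(bam_path):
    """Return the first existing index file, or None if none is found."""
    for candidate in index_candidates(bam_path):
        if candidate.is_file():
            return candidate
    return None


if find_bam_index("refcon_sorted.bam") is None:
    print("No index found; running 'samtools index' on the BAM file may be needed.")
```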