broadinstitute / viral-ngs

Viral genomics analysis pipelines
Other
188 stars 67 forks source link

escape vcf path passed to snpeff #992

Closed tomkinsc closed 4 years ago

tomkinsc commented 4 years ago

quote vcf path passed to snpeff so directories with spaces and other escapable characters are read correctly

dpark01 commented 4 years ago

Wait -- is there an example case where this code change fixes something? I would expect this kind of thing to only matter if we were using os.system calls like we did many years ago, but as far as I can tell, this code ought to be calling subprocess.*call or Popen or some other thing where we're not actually calling it via a shell invocation (quotations should not be necessary, whitespace should be fine).

That being said, some of our util.misc.run* code is a bit complex, and since we already decided to drop py2 support (and require py >=3.5) I think we could probably simplify or remove a ton of lines from there...

tomkinsc commented 4 years ago

Yes, I encountered a file-not-found failure when running snpeff on a vcf with a space in its file path (why Google Drive uses Shared drives/ as a directory name name is beyond me). It appears to be passed correctly as a single argument by our code to snpEff, so the failure appears to be with snpEff or its wrapper. Quoting or escaping seems to resolve the issue.

dpark01 commented 4 years ago

Ah so, technically maybe this ought to be a PR in the upstream wrapper scripts or code of snpEff itself, but this is a workaround based on the assumption that snpEff's executable is a shell script.

tomkinsc commented 4 years ago

True; just opened a PR upstream to fix the snpeff wrapper; closing this PR.