BradnerLab / pipeline

bradner lab computation pipeline scripts

bamliquidator_batch (calling bamliquidator_regions) fails on valid .bed file #43

Closed semenko closed 9 years ago

semenko commented 10 years ago

Not sure what's up here. Going to try to debug more.

semenko@nucleosome:~/git/pipeline/bamliquidator_internal/bamliquidatorbatch$ ./bamliquidator_batch.py -r ~/bed-analysis-quick-trash/peak-union-with-100bp-window.bed -e 200 -o /ramcache/test-baml/ ~/bed-analysis-quick-trash/raw-data/
Liquidating ~/bed-analysis-quick-trash/raw-data/gf-gd/GF6-TCR-GD_S10.bt2.srt.rmdup.bam (file 1 of 23)
ERROR   Unhandled exception: error parsing /[snip]/bed-analysis-quick-trash/peak-union-with-100bp-window.bed
Liquidation completed: 0.014897 seconds
Traceback (most recent call last):
  File "./bamliquidator_batch.py", line 508, in <module>
    main()
  File "./bamliquidator_batch.py", line 488, in main
    not args.quiet)
  File "./bamliquidator_batch.py", line 312, in __init__
    self.batch(extension, sense)
  File "./bamliquidator_batch.py", line 203, in batch
    raise Exception("%s failed with exit code %d" % (self.executable_path, return_code))
Exception: /home/semenko/git/pipeline/bamliquidator_internal/bamliquidator_regions failed with exit code 4

semenko commented 10 years ago

Hm, I think the .bed is missing the 4th (optional) strand column and failing the min_columns = 4 test.

charlesylin commented 10 years ago

We should set it to map to both strands (strand = .) in the absence of that column.

Charles Y. Lin, Ph.D., Dana-Farber Cancer Institute, Department of Medical Oncology, charles_lin@dfci.harvard.edu, http://bradnerlab.com

semenko commented 10 years ago

Scratch that -- it's just an optional "name" column that's missing. I think this should work fine if filled with "NA", etc. -- unless I'm missing some critical reason for the name column in the bed. [part of the matrix output?]
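
The workaround described here can be sketched in a few lines of Python. This is only an illustration, not part of the pipeline: the function name `fill_missing_names` is hypothetical, and it assumes the regions file is tab-separated and that a record with exactly three columns (chrom, start, end) just needs a placeholder appended as the fourth column.

```python
def fill_missing_names(lines, placeholder="NA"):
    """Yield BED lines, appending a name column to 3-column records.

    Hypothetical helper: assumes tab-separated input where the fourth
    column is the region name, as discussed in this thread.
    """
    for line in lines:
        stripped = line.rstrip("\r\n")
        # Pass blank lines and BED header/comment lines through unchanged.
        if not stripped.strip() or stripped.startswith(("#", "track", "browser")):
            yield stripped
            continue
        fields = stripped.split("\t")
        if len(fields) == 3:
            # Only chrom/start/end are present, so add the placeholder name.
            fields.append(placeholder)
        yield "\t".join(fields)
```

Usage would be along the lines of reading the original .bed with `open(path)` and writing each yielded line (plus a newline) to a new file, then passing that file to `-r`.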

semenko commented 10 years ago

Ah, ok, the name_column is preserved. Got it.

This might be worthy of a special warning, since technically anything after the first 3 columns is optional (I'd dropped some .bed peak names, figuring it would take the regions [chrXbpY] as names or something): http://genome.ucsc.edu/FAQ/FAQformat.html#format1
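
A pre-flight check along the lines suggested above might look like the following sketch. It is not how bamliquidator validates its input: `warn_on_short_bed_lines` is a hypothetical name, and the default `min_columns=4` simply mirrors the value mentioned earlier in this thread. The check assumes tab-separated records and skips blank, comment, `track`, and `browser` lines.

```python
import warnings


def warn_on_short_bed_lines(lines, min_columns=4):
    """Return 1-based line numbers of records with too few columns.

    Hypothetical check: emits one warning listing the affected lines,
    so the user sees the cause before the liquidator exits with an error.
    """
    short = []
    for number, line in enumerate(lines, start=1):
        stripped = line.rstrip("\r\n")
        # Skip blank lines and BED header/comment lines.
        if not stripped.strip() or stripped.startswith(("#", "track", "browser")):
            continue
        if len(stripped.split("\t")) < min_columns:
            short.append(number)
    if short:
        warnings.warn(
            "%d BED record(s) have fewer than %d columns (first at line %d); "
            "a name column may be required" % (len(short), min_columns, short[0])
        )
    return short
```

Returning the line numbers as well as warning keeps the check easy to reuse, for example to abort early or to decide whether to rewrite the file with placeholder names.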