arq5x / lumpy-sv

lumpy: a general probabilistic framework for structural variant discovery
MIT License

lumpyexpress bedpe parameter #260

Open fpbarthel opened 6 years ago

fpbarthel commented 6 years ago

What are the usage instructions for the -bedpe parameter in lumpyexpress? Unfortunately, this is not clear from the examples. Does it take the same form as the -B, -S and -D parameters? E.g.:

lumpyexpress \
    -B tumor.bam,normal.bam \
    -S tumor.splitters.bam,normal.splitters.bam \
    -D tumor.discordants.bam,normal.discordants.bam \
    -bedpe tumor.cnvnator.bedpe,normal.cnvnator.bedpe \
    -o tumor_normal.vcf

The lumpyexpress --help output suggests you also need to provide sample IDs, but this seems redundant given that they are not required for -B, -S and -D:

usage:   lumpyexpress [options]

     -d FILE  bedpe file of depths (comma separated and prefixed by sample:)
              e.g sample_x:/path/to/sample_x.bedpe,sample_y:/path/to/sample_y.bedpe

fpbarthel commented 6 years ago

Another question on this: I am using (link, is the filename misspelled?) to generate the input BEDPE files for this parameter. However, this generates two BEDPE files per sample, e.g. tumor.del.bedpe and tumor.dup.bedpe. Should the deletions and duplications from a single sample be merged into one BEDPE file here?

Also, for the --breakpoint_size parameter supplied to this Python script, should we use the same bin size as was used with CNVnator?

fpbarthel commented 6 years ago

Bump this thread? @ryanlayer ? (hope you don't mind the tag)

I get the error "Error: must specify depths as sample_id:bedpe" even when I specify samples in the given format. As the sample_id, I am using the value of the SM tag in the BAM header.


barthf$ samtools view -H tumor.bam | grep '^@RG' | sed "s/.*SM:\([^\t]*\).*/\1/g" | uniq

barthf$ samtools view -H normal.bam | grep '^@RG' | sed "s/.*SM:\([^\t]*\).*/\1/g" | uniq

barthf$ lumpyexpress \
    -B tumor.bam,normal.bam \
    -S tumor.splitters.bam,normal.splitters.bam \
    -D tumor.discordants.bam,normal.discordants.bam \
    -bedpe TUMOR-SAMPLE-SM-TAG:tumor.cnvnator.bedpe,NORMAL-SAMPLE-SM-TAG:normal.cnvnator.bedpe \
    -o tumor_normal.vcf

Error: must specify depths as sample_id:bedpe
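
For reference, the SM-tag lookup in the commands above can also be sketched with awk. This is only an illustration: it splits @RG lines on tabs instead of relying on how sed treats `\t` inside a bracket expression, which can differ between sed implementations. The function name `sm_tags` and the demo header line are made up for this example.

```shell
#!/bin/sh
# Print the unique SM (sample) values found in @RG header lines read from stdin.
# sm_tags is a hypothetical helper name, not part of samtools or lumpy.
sm_tags() {
    awk -F '\t' '/^@RG/ {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SM:/) print substr($i, 4)   # drop the "SM:" prefix
    }' | sort -u
}

# With a real BAM (requires samtools):
#   samtools view -H tumor.bam | sm_tags
# Self-contained demonstration with a made-up header:
printf '@HD\tVN:1.6\n@RG\tID:rg1\tSM:demo_sample\tPL:ILLUMINA\n' | sm_tags
```

Whether lumpyexpress requires the sample IDs to match these SM values is not confirmed in this thread.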

Here are my questions:

  1. Why is there an error message when I am supplying the parameter as suggested? Update: it turns out I was using lumpy's -bedpe instead of lumpyexpress's -d, which caused the problem. lumpyexpress gives an error related to bedpe rather than, e.g., "unknown parameter: -bedpe", so I didn't realize it until now.
  2. Why do we need to supply a sample ID? Is this because BEDPE files do not have a header?
  3. Should the DUP and DEL BEDPE files output by the script mentioned above be merged?
  4. What should be used as --breakpoint_size for that script?
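
Following up on question 1, here is a rough sketch of the same call using -d, in the form given by the usage text quoted earlier (comma separated, each path prefixed by `sample:`). The helper `build_depths_arg` is made up for this example, the sample IDs are the same placeholders as in the command above, and the script only prints the command instead of running it. Whether one BEDPE file per sample is the expected input is still an open question here.

```shell
#!/bin/sh
# Build the comma-separated "sample:path" value described for -d in the usage text.
# build_depths_arg is a hypothetical helper; arguments come in pairs:
#   sample_id path sample_id path ...
build_depths_arg() {
    out=""
    while [ "$#" -ge 2 ]; do
        out="${out:+$out,}$1:$2"   # add a comma only between entries
        shift 2
    done
    printf '%s\n' "$out"
}

# Placeholder sample IDs and file names; substitute the real ones.
DEPTHS=$(build_depths_arg \
    TUMOR-SAMPLE-SM-TAG tumor.cnvnator.bedpe \
    NORMAL-SAMPLE-SM-TAG normal.cnvnator.bedpe)

# Print the command for inspection rather than executing it.
echo lumpyexpress \
    -B tumor.bam,normal.bam \
    -S tumor.splitters.bam,normal.splitters.bam \
    -D tumor.discordants.bam,normal.discordants.bam \
    -d "$DEPTHS" \
    -o tumor_normal.vcf
```

Dropping the `echo` would run the command itself, assuming lumpyexpress is on the PATH.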