arq5x / lumpy-sv

lumpy: a general probabilistic framework for structural variant discovery
MIT License
307 stars 119 forks

Using priors with lumpy/lumpyexpress #128

Open bharatij opened 8 years ago

bharatij commented 8 years ago

Hi, I am using Lumpy to find structural variants. While reading the paper I came across the use of 1000 Genomes SVs as priors with Lumpy. I was trying to understand more about how to use priors with lumpy/lumpyexpress from the Lumpy documentation, but so far I have not been able to figure it out. Can you please shed more light on it? Thanks.

ryanlayer commented 8 years ago

Sure. To use priors, you need to convert the regions that you want to use to "seed" the analysis to BEDPE format, then simply include those files using the lumpy -B option. Unfortunately, lumpyexpress does not support this option, so you will need to use lumpy directly.

bharatij commented 8 years ago

Thanks for the details. So if I want to use deletions and duplications from the 1000 Genomes Project as priors, I can make a BEDPE file from the VCF using the lumpy-sv/scripts/vcfToBedpe script and use that BEDPE file as a prior with lumpy. I was checking the -B parameter you mentioned for including the prior file, but the documentation says it is an output file parameter. Is it the -bedpe parameter instead? Please correct me if anything is wrong. Once again, I appreciate your help.

ryanlayer commented 8 years ago

Unfortunately, vcfToBedpe is not going to help here. The format of the BEDPE is:

1 1 101 1 9900 10100 1 9999 + + TYPE:DELETION

That last TYPE tag is very important, since lumpy needs it to properly cluster this breakpoint with the alignment. How to get your VCF into this format depends on the fields you have available. If you have "END" in your INFO field, you can use something like:

bcftools query -f "%CHROM\t%POS\t%INFO/CIPOS\t%INFO/END\t%INFO/CIEND\n" you.vcf.gz

to get the fields, then transform those fields with awk, python, etc.
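As a rough illustration of that transform step, here is a minimal Python sketch that turns one row of the bcftools query output above into a BEDPE row shaped like the example. Several things in it are assumptions rather than anything confirmed in this thread: the function names `parse_ci` and `to_bedpe` are made up for illustration, CIPOS/CIEND are assumed to be "lo,hi" offsets (a missing "." is treated as 0,0), the 1-based VCF position is assumed to map to a 0-based, half-open interval, and the score (9999) and strands (+ +) are simply copied from the example line. The SV type is passed in by hand because the query above does not extract it, and only TYPE:DELETION appears in this thread.

```python
def parse_ci(ci):
    """Parse a 'lo,hi' confidence interval; a missing value ('.') becomes 0,0 (assumption)."""
    if ci in (".", ""):
        return 0, 0
    lo, hi = ci.split(",")
    return int(lo), int(hi)


def to_bedpe(line, name, sv_type="DELETION", score="9999", strands=("+", "+")):
    """Turn one tab-separated 'CHROM POS CIPOS END CIEND' row into a BEDPE row.

    score and strands default to the values in the example line from the thread;
    they are placeholders, not values derived from the VCF.
    """
    chrom, pos, cipos, end, ciend = line.rstrip("\n").split("\t")
    pos_lo, pos_hi = parse_ci(cipos)
    end_lo, end_hi = parse_ci(ciend)

    # Assumed convention: 1-based VCF coordinate -> 0-based, half-open interval,
    # widened by the confidence interval and clamped at 0.
    start1 = max(0, int(pos) + pos_lo - 1)
    end1 = int(pos) + pos_hi
    start2 = max(0, int(end) + end_lo - 1)
    end2 = int(end) + end_hi

    fields = [chrom, start1, end1, chrom, start2, end2, name, score,
              strands[0], strands[1], "TYPE:" + sv_type]
    return "\t".join(str(f) for f in fields)


# One hypothetical input row, as the bcftools query above would print it.
example = "1\t51\t-50,50\t10000\t-100,100"
print(to_bedpe(example, name=1))
```

To convert a whole file, the same function could be applied to each non-empty line of the bcftools output, using a running counter as the name. Check the resulting intervals against your own data before relying on them, since the coordinate convention here is only a guess.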
