Open JJBio opened 4 years ago
Each “-pe” will get its own histogram, so separating out the different libraries with different properties is a good move. Do not merge things back together. Each sample should have its own “-pe” and “-sr”.
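As a rough illustration of "one -pe per library, each with its own histogram", here is a minimal dry-run sketch that only prints a lumpy command line and does not run anything. The sample name, read-group IDs, histogram file names, and mean/stdev values are hypothetical placeholders; the option keys are the ones used in the command quoted in this thread.

```shell
#!/bin/sh
# Dry-run sketch: print a lumpy command with one -pe option per read group.
# Sample name, read-group IDs, histogram files, and mean/stdev values are
# hypothetical placeholders; nothing here runs lumpy.

# pe_opt SAMPLE READ_GROUP HISTO_FILE MEAN STDEV
pe_opt() {
    printf '%s' "id:$1,read_group:$2,bam_file:$1.discordants.bam,histo_file:$3,mean:$4,stdev:$5,read_length:101,min_non_overlap:101,discordant_z:5,back_distance:10,weight:1,min_mapping_threshold:20"
}

# build_lumpy_cmd SAMPLE RG:HISTO:MEAN:STDEV [RG:HISTO:MEAN:STDEV ...]
build_lumpy_cmd() {
    sample=$1
    shift
    cmd="lumpy -mw 4 -tt 0"
    for spec in "$@"; do
        # Split "rg:histo:mean:stdev" on the colons.
        rg=${spec%%:*};    rest=${spec#*:}
        histo=${rest%%:*}; rest=${rest#*:}
        mean=${rest%%:*};  stdev=${rest#*:}
        cmd="$cmd -pe $(pe_opt "$sample" "$rg" "$histo" "$mean" "$stdev")"
    done
    # One -sr option for the sample, as in the command quoted in this thread.
    cmd="$cmd -sr id:$sample,bam_file:$sample.splitters.bam,back_distance:10,weight:1,min_mapping_threshold:20"
    printf '%s > %s.vcf\n' "$cmd" "$sample"
}

build_lumpy_cmd sample rg1:sample.lib1.histo:500:50 rg2:sample.lib2.histo:300:30
```

The mean and stdev for each read group should come from that library's own insert-size statistics, not from the placeholder numbers used here.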
That said, you should use our new lumpy wrapper, smoove:
https://github.com/brentp/smoove
Then
Smoove makes all of those steps easy.
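For orientation, a single-sample smoove invocation might look like the dry-run sketch below, which only prints the command. The BAM, reference FASTA, exclude BED, and output directory names are hypothetical, and the flags follow the usage shown in the smoove README; confirm them against `smoove call --help` for your installed version.

```shell
#!/bin/sh
# Dry-run sketch: print a single-sample smoove command.
# The BAM, reference FASTA, exclude BED, and output directory are hypothetical.

# smoove_cmd SAMPLE BAM FASTA EXCLUDE_BED OUTDIR
smoove_cmd() {
    printf 'smoove call --outdir %s --exclude %s --name %s --fasta %s -p 1 --genotype %s\n' \
        "$5" "$4" "$1" "$3" "$2"
}

smoove_cmd sample sample.bam reference.fa exclude.bed results-smoove/
```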
On Aug 5, 2019, at 10:41 AM, JJBio notifications@github.com wrote:
If I understood correctly, I cannot use lumpy express on a merged bam file with different RGs.
So, I have to have separate bam files (aligned with bwa mem -M -R ) and do this on each:
1. Sort & index with Picard tools
2. Mark duplicates with Picard tools
3. Sort & index with Picard tools
4. Extract the discordant paired-end alignments
5. Extract the split-read alignments
6. Sort discordants & splitters with samtools sort
7. Generate empirical insert size statistics for each bam file

Then run lumpy like this:

lumpy \
    -mw 4 \
    -tt 0 \
    -pe id:sample,read_group:rg1,bam_file:sample.discordants.bam,histo_file:sample.lib1.histo,mean:500,stdev:50,read_length:101,min_non_overlap:101,discordant_z:5,back_distance:10,weight:1,min_mapping_threshold:20 \
    -pe id:sample,read_group:rg2,bam_file:sample.discordants.bam,histo_file:sample.lib2.histo,mean:500,stdev:50,read_length:101,min_non_overlap:101,discordant_z:5,back_distance:10,weight:1,min_mapping_threshold:20 \
    -sr id:sample,bam_file:sample.splitters.bam,back_distance:10,weight:1,min_mapping_threshold:20 \
    > sample.vcf

What I am confused about is this line:

-sr id:sample,bam_file:sample.splitters.bam,back_distance:10,weight:1,min_mapping_threshold:20

Do I merge the splitters from the different RGs into one bam file? And are back_distance:10,weight:1,min_mapping_threshold:20 default parameters that should be kept regardless of the RG?
And my last question (sorry for the many questions): what do I do when I have a complex design, with multiple libraries with different insert sizes and multiple lanes? Should I feed all of them separately? Or do I only treat different libraries as different RGs in this case? Thanks so much!!!
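The extraction and insert-size steps listed in the question can be sketched as the dry-run below, which prints the commands and does not execute them. It assumes a lumpy-sv checkout (for the `scripts/` helpers), a samtools version that supports `sort -o`, and hypothetical sample and read-group names; the histogram step is repeated once per read group so that each library gets its own histogram file.

```shell
#!/bin/sh
# Dry-run sketch: print (not execute) lumpy preprocessing commands.
# Sample and read-group names are hypothetical; the scripts/ paths assume
# you are running from a lumpy-sv checkout.

READ_LENGTH=101   # read length used in the question

# print_extract_cmds SAMPLE: discordant and split-read extraction plus sorting
print_extract_cmds() {
    s=$1
    echo "samtools view -b -F 1294 $s.bam > $s.discordants.unsorted.bam"
    echo "samtools view -h $s.bam | scripts/extractSplitReads_BwaMem -i stdin | samtools view -Sb - > $s.splitters.unsorted.bam"
    echo "samtools sort -o $s.discordants.bam $s.discordants.unsorted.bam"
    echo "samtools sort -o $s.splitters.bam $s.splitters.unsorted.bam"
}

# print_histo_cmd SAMPLE READ_GROUP: insert-size histogram for one read group
print_histo_cmd() {
    echo "samtools view -r $2 $1.bam | tail -n+100000 | scripts/pairend_distro.py -r $READ_LENGTH -X 4 -N 10000 -o $1.$2.histo"
}

print_extract_cmds sample
for rg in rg1 rg2; do
    print_histo_cmd sample "$rg"
done
```

Each histogram file produced this way would then be passed to its own -pe option through `histo_file`, together with the mean and stdev measured for that library.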