tanglingfung closed this issue 11 years ago
Paul; variant2 is a slimmed down pipeline where the initial alignment step is done via unix pipes. The step-by-step, disk-intensive processes in the initial pipeline did not scale to large whole genome sequences. As a result, some steps, like trim_reads, are not yet implemented there. Quality improvements in reads have made trimming less crucial, although this can always change as read lengths change. Nick Loman has a good post on this from the perspective of assembly:
http://pathogenomics.bham.ac.uk/blog/2013/04/adaptor-trim-or-die-experiences-with-nextera-libraries/
Are you noticing specific problems, or just curious about the new pipeline?
For the summary issues, can you paste the error message you're seeing? Thanks much.
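For readers unfamiliar with the piped approach described above, here is a minimal Python sketch of the general idea: one process streams its output straight into the next, so no intermediate file is written to disk. This is not the bcbio-nextgen implementation. The `run_piped` helper is hypothetical, and the two stand-in commands are small Python one-liners rather than a real aligner and sorter.

```python
import subprocess
import sys


def run_piped(producer_cmd, consumer_cmd):
    """Stream the producer's stdout into the consumer's stdin, with no temp file."""
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(consumer_cmd, stdin=producer.stdout,
                                stdout=subprocess.PIPE)
    # Close our copy of the pipe so the producer is notified if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    if producer.returncode != 0 or consumer.returncode != 0:
        raise RuntimeError("a step in the pipe failed")
    return output


# Stand-ins (assumptions): the first command plays the role of the aligner,
# the second plays the role of a downstream sort/convert step.
emit = [sys.executable, "-c", "print('read2'); print('read1')"]
sort_lines = [sys.executable, "-c",
              "import sys; sys.stdout.writelines(sorted(sys.stdin))"]

result = run_piped(emit, sort_lines)
print(result.decode())
```

The trade-off is that a piped step leaves no intermediate output to inspect or resume from, which is one reason per-step operations are harder to slot into such a pipeline.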
Thanks Brad.
I am just curious about the new pipeline at the moment, because it sounds like it could handle exome-scale work.
As for the summary stats, there is no error message, but some metrics files are skipped. The variant calling also stops after snpEff annotation (the file ending in -effect.vcf), while I was expecting -annotated.vcf. Other things work fine. I think there are probably some settings issues on my end. I tried to look for them but do not know where to start.
(In reply to Brad Chapman's comment of Thursday, April 25, 2013, quoted above.)
I may also have read your code wrong. It appears to me that it does recalibration before realignment, but GATK recommends doing realignment before recalibration. I am not sure whether it makes any difference, but is there a reason to do recalibration before realignment? Just curious.
Paul; Thanks as always for all the helpful feedback. I'll try to take these points one by one:
-annotated.vcf file: This is no longer generated and we use the snpEff VCF directly instead of the GATK annotation walker. GATK was not keeping up with snpEff revisions and did not appear to be offering a lot of value over the direct snpEff VCF.

Hope this helps. Let me know if you have any more questions.
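As an illustration of working with the snpEff VCF directly, here is a small Python sketch that pulls the effect name and impact out of the INFO column. It assumes the `EFF=Effect(Impact|...)` layout written by older snpEff releases (newer releases use an `ANN` field instead). The `parse_eff` helper and the sample INFO string are made up for the example and are not taken from bcbio-nextgen.

```python
import re

# One annotation looks like Effect(field1|field2|...); several are comma-separated.
EFF_ENTRY = re.compile(r"(?P<effect>[^(,]+)\((?P<fields>[^)]*)\)")


def parse_eff(info):
    """Return (effect, impact) pairs from the EFF entry of a VCF INFO string."""
    for item in info.split(";"):
        if item.startswith("EFF="):
            return [(m.group("effect"), m.group("fields").split("|")[0])
                    for m in EFF_ENTRY.finditer(item[len("EFF="):])]
    return []  # no snpEff annotation on this record


# Invented INFO value, for illustration only.
info = ("DP=35;EFF=NON_SYNONYMOUS_CODING(MODERATE|MISSENSE|gCc/gTc|A12V|300|"
        "GENE1|protein_coding|CODING|TX1|2),"
        "DOWNSTREAM(MODIFIER||||300|GENE2|protein_coding|CODING|TX2|)")

effects = parse_eff(info)
print(effects)
```

For anything beyond a quick check, a dedicated VCF parser is likely a safer choice than splitting strings by hand.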
Thanks Brad. I hope I can soon contribute more than just feedback.
So, for the variant2 pipeline, is trimming reads not recommended?
Also, for some reason I have issues generating the insert size, duplicates and variant summaries. Where should I start looking for a possible mistake on my side?
Thanks.