sequencing / isaac_aligner

Isaac Genome Alignment Software

Alignment from fastq-gz gives small bams #11

Closed kevyin closed 10 years ago

kevyin commented 10 years ago

As input I've used a pair of fastq-gz WGS files from a HiSeq X, about 80G for R1 and R2 together.

The output bam sizes are only about 1G.

kevyin commented 10 years ago

argc

/directflow/ClinicalGenomicsPipeline/resources/bin/isaac/isaac_aligner-iSAAC-01.14.04.17/bin/isaac-align --pf-only 1 --stop-at Finish --reference-genome /directflow/ClinicalGenomicsPipeline/work_dirs/work_runs/140815_ST-E00118_0067_BH0957ALXX/paddy/pipeline/tools/biodata_gi/genome_indices/gatk-resource-bundle/2.8/b37_indexed/isaac/01.14.04.17/IsaacIndex/sorted-reference.xml --bam-gzip-level 6 --gap-scoring bwa --keep-duplicates 1 --keep-unaligned back --mark-duplicates 1 --stats-image-format none --variable-fastq-read-length yes --ignore-missing-bcls 1 --ignore-missing-filters 1 --realign-gaps no --cleanup-intermediary 1 --temp-directory /tmp/IsaacFqzTemp_glsai/H0957ALXX_4_NA12878_S_Human__NA12878_S_TAO305A24PA1_Reference_DNA_samples.isaac_align_fqz.isaac_align_inputs --jobs 16 --output-parallel-save 16 --memory-limit 96 --output-directory phase0/H0957ALXX_4_NA12878_S_HumanNA12878_S_TAO305A24PA1_Reference_DNA_samples.isaac_align_fqz.isaac_align --base-quality-cutoff 15 --base-calls phase0/H0957ALXX_4_NA12878_S_HumanNA12878_S_TAO305A24PA1_Reference_DNA_samples.isaac_align_fqz.isaac_align_inputs --base-calls-format fastq-gz --sample-sheet /directflow/ClinicalGenomicsPipeline/work_dirs/work_runs/140815_ST-E00118_0067_BH0957ALXX/SampleSheet.csv --default-adapters Standard

come-raczy commented 10 years ago

A good starting point would be checking the percentage of reads passing filter in these fastq files.
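One way to run that check directly on a fastq-gz file is sketched below. It is not part of Isaac; it assumes the files carry CASAVA 1.8+ style headers, where the second header field looks like `<read>:<is_filtered>:<control>:<index>` and `is_filtered` is `N` for reads that passed filter and `Y` for reads that failed. It also assumes plain 4-line FASTQ records. The function name `pf_fraction` is made up for illustration.

```python
import gzip


def pf_fraction(fastq_gz_path):
    """Return (total_reads, pf_reads) for a gzipped FASTQ file.

    Assumes CASAVA 1.8+ headers and exactly 4 lines per record.
    """
    total = 0
    passed = 0
    with gzip.open(fastq_gz_path, "rt") as handle:
        for line_no, line in enumerate(handle):
            if line_no % 4 != 0:
                continue  # only the first line of each record is a header
            total += 1
            fields = line.rstrip("\n").split(" ", 1)
            if len(fields) < 2:
                raise ValueError("header has no filter field: " + line.strip())
            comment = fields[1].split(":")
            if len(comment) < 2:
                raise ValueError("unexpected header comment: " + fields[1])
            # comment[1] is the is_filtered flag: "N" means the read passed filter
            if comment[1] == "N":
                passed += 1
    return total, passed
```

Calling `pf_fraction` on the R1 file and dividing `pf_reads` by `total_reads` gives the fraction of reads flagged as passing filter. If the headers do not follow this layout the function raises instead of guessing.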

kevyin commented 10 years ago

Hi, thanks for the reply. The FastQC report and SAV for these show the data looks pretty good. I'm pretty sure our HiSeq X run wasn't so bad that only 10% passed filter (1G instead of ~100G).

come-raczy commented 10 years ago

So what is the %PF then?

kevyin commented 10 years ago

For all lanes, about: PF ~70%, Reads PF (M) 400. I ran bcl2fastq2 (reads were not indexed), then softlinked the files and ran isaac in fastq-gz mode.

I just did an isaac straight on the RunFolders and the bams turned out fine.

kevyin commented 10 years ago

Hi, turns out I wasn't using the latest version. I upgraded from iSAAC-01.14.04.17 to iSAAC-01.14.07.17 and the bams are similar sizes now. Cheers