mobinasri / flagger

Evaluating genome assemblies
MIT License

samtools error: Chromosome blocks not continuous #24

soisa001 closed this issue 1 year ago

soisa001 commented 1 year ago

Running flagger_end_to_end.wdl via Cromwell with a Slurm backend, I get an error in the call-preprocess step. It looks like samtools cannot create the index for the BAM.

Inputs are the HPRC assemblies and aligned bam (minimap2) from https://s3-us-west-2.amazonaws.com/human-pangenomics/index.html?prefix=working/HPRC_PLUS/HG00733/. Specifically, HG00733.maternal.f1_assembly_v2_genbank.fa.gz, HG00733.paternal.f1_assembly_v2_genbank.fa.gz, and HG00733_20190925_EEE_m54329U_190607_185248.Q20.fastq.gz. hap1 and hap2toRefBam files are generated with asm2asmAlignment.wdl.

cat HG00733.maternal.f1_assembly_v2_genbank.fa.gz HG00733.paternal.f1_assembly_v2_genbank.fa.gz > combined.fa.gz
minimap2 -ax map-pb --threads 32 --cs --eqx -Y -L -I8g combined.fa.gz HG00733_20190925_EEE_m54329U_190607_185248.Q20.fastq.gz > read_alignment.sam
samtools view -hb read_alignment.sam > read_alignment.bam

These inputs, along with the recommended FlaggerEndToEnd inputs, are passed to Cromwell as a JSON file (see the attached inputs.json). The config is also attached.

Error message:

  • OPTIONS='--primaryOnly --minReadLen 5000 --minAlignment 5000 --maxDiv 0.02 --phasingLog /cromwell-executions/FlaggerEndToEnd/1082ae7f-76e1-4c86-b5fb-d4b30730c9ee/call-preprocess/runFlaggerPreprocess/f49c67ef-a8e5-4f8b-aa5b-4a1d08526a7d/call-correctBam/inputs/-1247867400/read_alignment_minimap2secphase_v0.3.0.phasing_out.txt --exclude read_alignment_minimap2.excluded_read_ids.txt'
  • correct_bam --primaryOnly --minReadLen 5000 --minAlignment 5000 --maxDiv 0.02 --phasingLog /cromwell-executions/FlaggerEndToEnd/1082ae7f-76e1-4c86-b5fb-d4b30730c9ee/call-preprocess/runFlaggerPreprocess/f49c67ef-a8e5-4f8b-aa5b-4a1d08526a7d/call-correctBam/inputs/-1247867400/read_alignment_minimap2secphase_v0.3.0.phasing_out.txt --exclude read_alignment_minimap2.excluded_read_ids.txt -i /cromwell-executions/FlaggerEndToEnd/1082ae7f-76e1-4c86-b5fb-d4b30730c9ee/call-preprocess/runFlaggerPreprocess/f49c67ef-a8e5-4f8b-aa5b-4a1d08526a7d/call-correctBam/inputs/-85231667/read_alignment_minimap2.bam -o output/read_alignment_minimap2.corrected.bam -n8
  • samtools index -@8 output/read_alignment_minimap2.corrected.bam
    [E::hts_idx_push] Chromosome blocks not continuous
    [E::sam_index] Read 'm54329U_190607_185248/28/ccs' with ref_name='HG00733#1#JAHEPQ010000069.1', ref_length=51427907, flags=0, pos=10296551 cannot be indexed
    samtools index: failed to create index for "output/read_alignment_minimap2.corrected.bam": No such file or directory

config_and_inputs.zip

mobinasri commented 1 year ago

Hi @soisa001, did you sort the input BAM file? I mean read_alignment.bam.
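One quick way to answer that question is to look at the BAM header: a coordinate-sorted file normally declares `SO:coordinate` on its `@HD` line. The sketch below is only an illustration and not part of flagger; the function name `header_is_coordinate_sorted` is hypothetical, and it assumes the header text is piped in on stdin (for example from `samtools view -H`). The header tag only records what the writing tool declared, so treat the result as a hint rather than proof.

```shell
# Hypothetical helper (not part of flagger or samtools).
# Reads SAM header text on stdin and succeeds (exit 0) only if the
# @HD line declares coordinate sort order.
header_is_coordinate_sorted() {
    grep -q -E '^@HD.*SO:coordinate'
}

# Usage, assuming samtools is on PATH:
#   samtools view -H read_alignment.bam | header_is_coordinate_sorted \
#       && echo "declared coordinate-sorted" \
#       || echo "not declared coordinate-sorted"
```

A BAM written directly from unsorted aligner output would typically fail this check, which is consistent with the "Chromosome blocks not continuous" error from `samtools index`.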

soisa001 commented 1 year ago

Hi Mobin,

I didn't do that, as it wasn't mentioned on the GitHub page. Is it a required step? If so, it would be worth mentioning there. I will try it now and let you know if it works.

Thank you

mobinasri commented 1 year ago

Sorry, you're right. I just updated the README.
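For readers who land here with the same error, the missing step can be sketched as follows. The README wording itself is not quoted in this thread, so this is an assumption about what the fix looks like in practice: coordinate-sort the read alignment with `samtools sort`, then index it, before passing it to the workflow. The function name `sort_and_index_bam`, the output filename, and the default thread count are hypothetical.

```shell
# Hypothetical helper (not part of flagger): coordinate-sort a BAM, then index it.
#   $1 = input BAM (e.g. read_alignment.bam)
#   $2 = output sorted BAM (e.g. read_alignment.sorted.bam)
#   $3 = number of additional threads (optional, defaults to 8)
sort_and_index_bam() {
    # samtools sort orders records by reference and position by default;
    # the index is only built if the sort succeeded.
    samtools sort -@ "${3:-8}" -o "$2" "$1" &&
        samtools index -@ "${3:-8}" "$2"
}

# Usage, assuming samtools is on PATH:
#   sort_and_index_bam read_alignment.bam read_alignment.sorted.bam 8
```

The sorted BAM would then replace read_alignment.bam in inputs.json.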