cerebis / qc3C

Reference-free quality assessment for Hi-C sequencing data
GNU Affero General Public License v3.0

Failure to catch missing header exception #59

Closed: cerebis closed this issue 2 years ago

cerebis commented 2 years ago

I am running into the same error using the conda version of qc3C

qc3C bam --enzyme DpnII --fasta /home/jon/Working_Files/s_chloronotus/hapo-g/round_3/hapog_results/hapog.fasta --bam /home/jon/Working_Files/s_chloronotus/bwa-mem/aligned.SRR8499559.bam --max-obs 100000 --output-path /home/jon/Working_Files/s_chloronotus/qc3c/SRR8499559_w_bam/
INFO     | 2022-01-28 11:20:27,247 | qc3C.ligation | Loading cached cut-site database
INFO     | 2022-01-28 11:20:27,422 | qc3C.bam_based | Accepting all usable reads
INFO     | 2022-01-28 11:20:27,423 | qc3C.utils | Random seed was not set, using 1383103
INFO     | 2022-01-28 11:20:27,423 | qc3C.bam_based | Beginning analysis...
Pairs:   0%|                                                                                 | 0/100000 [00:00<?, ?it/s][E::idx_find_and_load] Could not retrieve index file for '/home/jon/Working_Files/s_chloronotus/bwa-mem/aligned.SRR8499559.bam'
ERROR    | 2022-01-28 11:20:27,429 |    main | 'HD'
Pairs:   0%|         

Originally posted by @JonEilers in https://github.com/cerebis/qc3C/issues/58#issuecomment-1024536793

cerebis commented 2 years ago

Hi @JonEilers

I have split your issue from the other, as I do not believe they are the same problem.

In your case, the header record appears to be missing from the BAM file, and the error message from qc3C is terribly unhelpful: it merely reports back 'HD', which is comically bad. My apologies.

Please try the following, to see if the HD record exists.

samtools view -H /home/jon/Working_Files/s_chloronotus/bwa-mem/aligned.SRR8499559.bam | grep '^@HD'
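If it is easier to script, the same check can be sketched in Python. This is only an illustration: it assumes you have already captured the header text (for example the output of samtools view -H) into a string, and the function name find_hd_record is hypothetical, not part of qc3C.

```python
def find_hd_record(header_text):
    """Return the @HD line from SAM header text, or None if it is absent.

    header_text is assumed to be the plain-text header, one record per
    line, as printed by `samtools view -H`.
    """
    for line in header_text.splitlines():
        # Header fields are tab separated; the first field is the record type.
        if line.split("\t")[0] == "@HD":
            return line
    return None


# Hypothetical header with no @HD record, similar to the situation above.
example = "@SQ\tSN:contig_1\tLN:1000\n@PG\tID:bwa\tPN:bwa"
if find_hd_record(example) is None:
    print("No @HD record found in header")
```

An empty result from either the grep or this function means the BAM has no @HD record.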

Are you using Samtools to create the BAM file?

cerebis commented 2 years ago

On review, the failure comes from the code always expecting the HD record to exist in the header dictionary returned by pysam.

In the master branch, this condition will now report an informative error. It will be rolled into a new release in the near future.
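For anyone curious why the message was just 'HD': a rough sketch of the failure mode and of a guarded lookup is below. This is not the actual qc3C code. It assumes the header is available as a plain dictionary keyed by record type (the kind of mapping pysam exposes for a BAM header), and the names MissingHeaderError and require_hd_record are hypothetical.

```python
class MissingHeaderError(Exception):
    """Hypothetical exception for a BAM header that lacks an @HD record."""


def require_hd_record(header, bam_path):
    """Return the HD record from a header dictionary, or fail informatively.

    header is assumed to be a plain dict keyed by record type
    ('HD', 'SQ', 'PG', ...).
    """
    try:
        return header["HD"]
    except KeyError:
        raise MissingHeaderError(
            f"BAM file {bam_path} has no @HD header record"
        ) from None


# A header without an HD entry, as in the report above.
header = {"SQ": [{"SN": "contig_1", "LN": 1000}]}

# Unguarded access raises KeyError, whose string form is only the quoted key.
try:
    header["HD"]
except KeyError as err:
    bare_message = str(err)

# Guarded access names the file and the missing record instead.
try:
    require_hd_record(header, "aligned.bam")
except MissingHeaderError as err:
    helpful_message = str(err)

print(bare_message)
print(helpful_message)
```

Logging a caught KeyError directly produces only the quoted key, which matches the 'HD' line in the log above.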