BlanchetteLab / HIFI

Hi-C Interaction Frequency Inference (HIFI): High-resolution estimation of DNA-DNA interaction frequency from Hi-C data

Some issues with samtools when running the example dataset #15

Open princeps091-binf opened 3 years ago

princeps091-binf commented 3 years ago

Hello, when trying to run the BAMtoSparseMatrix.py script with the example dataset, I was unable to get past the samtools view step. After some digging, I noticed that the samtools error was:

Malformed key:value pair at line 87: "@PG Hicup Mapper (version: 0.5.3)"
samtools view: failed to add PG line to the header

(I am using samtools version 1.12 (using htslib 1.12).)

I found that adding the --no-PG argument to line 148 of your BAMtoSparseMatrix.py script solved the issue:

convert_process = subprocess.Popen([samtools_exe, "view", "--no-PG", args.bam_filepath], stdout=subprocess.PIPE, stderr=subprocess.PIPE)

I am not sure whether this will have detrimental implications for the rest of the process, but it seems to produce the expected sparse matrix.

Wondering if you had any better solutions to resolve this particular issue?

Thanking you in advance and congratulations on this great tool!

ccameron commented 1 year ago

@princeps091-binf HIFI was developed using an older version of SAMtools. Your suggestion is correct and shouldn't impact the results. BAMtoSparseMatrix.py has been updated to try the --no-PG command line option if the original command fails.
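
For anyone hitting the same error, the fallback described above could look roughly like the sketch below. It is not the actual code in BAMtoSparseMatrix.py: the function names build_view_commands and view_with_fallback, and the injectable run parameter, are hypothetical and only illustrate the idea. The plain command is tried first because older SAMtools releases may not recognize --no-PG.

```python
import subprocess


def build_view_commands(samtools_exe, bam_filepath):
    """Return the commands to try, in order (hypothetical helper).

    The original invocation comes first; the --no-PG variant is the fallback.
    """
    return [
        [samtools_exe, "view", bam_filepath],
        [samtools_exe, "view", "--no-PG", bam_filepath],
    ]


def view_with_fallback(samtools_exe, bam_filepath, run=subprocess.run):
    """Run `samtools view`, retrying with --no-PG if the first attempt fails.

    `run` is injectable so the retry logic can be exercised without SAMtools
    installed; by default it is subprocess.run.
    """
    last = None
    for cmd in build_view_commands(samtools_exe, bam_filepath):
        last = run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        if last.returncode == 0:
            # Success: return the SAM records as bytes.
            return last.stdout
    # Both attempts failed: surface the stderr of the last one.
    raise RuntimeError(
        "samtools view failed: " + last.stderr.decode(errors="replace")
    )
```

Calling view_with_fallback("samtools", "example.bam") would return the SAM text as bytes, or raise RuntimeError with the last stderr message if both attempts fail. Note that this sketch buffers the whole output in memory, so a streaming Popen-based version may be preferable for large BAM files.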