aidenlab / juicer

A One-Click System for Analyzing Loop-Resolution Hi-C Experiments
http://aidenlab.org
MIT License

$splitdir/*res.txt does not exist when running mega_from_bams.sh #268

Closed pna059 closed 2 years ago

pna059 commented 2 years ago

When running mega_from_bams.sh on .bam files created by a tool other than the Juicer toolkit (HiCUP here), $splitdir is not available for generating inter.txt and the subsequent outputs.

samtools view "$cThreadString" -F 1024 -O sam "${outputDir}"/mega_merged_dedup.bam | awk -v mapq=1 -f "${juiceDir}"/scripts/common/sam_to_pre.awk > "${outputDir}"/merged1.txt

...runs but produces an empty file, while the merged BAM itself does contain reads:

samtools view -@30 -F 1024 -O sam mega_merged_dedup.bam | head
A00703:39:HTKMFDRXX:1:1101:14326:1031   99      chr4H   434501938       35      151M    chr6H   447625737       0       AGACAGAAGCTTTCGGCGACGGAAAAGTACTTTCGATGATATCCTGATTTTTTGTGGAATTTTTGGGGATATATAGGCGCAAACCCTAGGGCAAAGGAGGTCTAGGGGGCCCACAAGCCTGTGGGCCGCGGCCTCCCCCTGTCCATGGGGT    FFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFF::FFF::FFFFFFFFFFFFF:FFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF    AS:i:0  XS:i:-50        XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:151        YT:Z:UU CT:Z:TRANS
A00703:21:HMMYTDMXX:1:1101:31349:1047   99      chr6H   60241394        35      151M    =       60241514        0       ACATATCTCTCTGTGTTATAACTGTTGCATGATGAATAGCATCCGGCATAATCATCCATCACCGATCCAATGCCTATGAGTCTTTCCTACTGGTCCTTGCTACATTACTTTGCCGCTACTGCTGTCACTGCTGCTACTATTACTTTGACGG    FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFF:FFF    AS:i:0  XS:i:-47        XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:151        YT:Z:UU CT:Z:CLOSE
A00703:21:HMMYTDMXX:1:1101:31349:1047   147     chr6H   60241514        38      150M    =       60241394        0       GCTGTCACTGCTGCTACTATTACTTTGACGGTAGTGTTGTTACTTTGCTGCTACTAGTTACTGTTGCTACTGCTGCTATCATACTACCTTGCTACTGATACTTTGCTGCACATACTATATCTTTCAGATGTGGTTGAATTGACAATTCAA     FFF,F,FFFFF:FFFFFFF:FFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:F:FF:FFFFFFFFFFFFFFFFFFFFFFF     AS:i:0  XS:i:-76        XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:150        YT:Z:UU CT:Z:CLOSE
.......
sa501428 commented 2 years ago

Hi! The bams produced by juicer/encode have additional tags added for each read, so this is the expected behavior.
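
To see which optional tags a given BAM actually carries, something like the sketch below can help when comparing a HiCUP-derived file against a Juicer-produced one. This is not part of Juicer: `list_sam_tags` is a hypothetical helper name, and it only counts tag names in the optional columns (column 12 onward in SAM), without assuming which tags sam_to_pre.awk expects.

```shell
#!/usr/bin/env bash
# list_sam_tags (hypothetical helper): read SAM records on stdin and print
# each optional tag name with the number of records it appears in.
list_sam_tags() {
  awk -F'\t' '
    /^@/ { next }                        # skip SAM header lines
    {
      for (i = 12; i <= NF; i++) {       # optional fields start at column 12
        split($i, parts, ":")            # TAG:TYPE:VALUE -> parts[1] is TAG
        count[parts[1]]++
      }
    }
    END { for (t in count) print t "\t" count[t] }
  ' | sort
}

# Usage (assumes samtools is on PATH; the file name is the one from this thread):
# samtools view -F 1024 mega_merged_dedup.bam | head -n 10000 | list_sam_tags
```

Running the same command on a BAM written by Juicer and on the merged HiCUP BAM, then comparing the two tag lists, should show which tags are missing from the latter.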

pna059 commented 2 years ago

But I only have the mega_merged_nodups.bam file from samtools merge, which ran OK. How can I finish the script and generate the .hic file from these replicates?

sa501428 commented 2 years ago

You may want to try using the advice here: https://groups.google.com/g/3d-genomics/c/c5F31sGQeAg/m/P3b31apUAQAJ

Also, since this is a general question rather than a bug, let's move any follow-up questions to the forum instead of GitHub Issues.