zengxiaofei / HapHiC

HapHiC: a fast, reference-independent, allele-aware scaffolding tool based on Hi-C data
https://www.nature.com/articles/s41477-024-01755-3
BSD 3-Clause "New" or "Revised" License

consult #91

Open casusr opened 4 days ago

casusr commented 4 days ago

May I ask a question? I used the command samtools view -@4 -f 4 -f 256 -bq 20 to filter the BAM file and generate uniq.bam. I then ran haphic pipeline p_ctg.fa uniq.bam 40 --RE GATC and ran bash juicebox.sh in the results, but out_JBAT.txt does not exist or does not contain any reads.

zengxiaofei commented 3 days ago

You may have misunderstood the meaning of the -f parameter in samtools view:

-f INT   only include reads with all of the FLAGs in INT present

If you do not have any specific requirements, I would recommend following the method provided in our documentation.
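
To illustrate the difference between -f (require bits) and -F (exclude bits), here is a minimal Python sketch of the bitwise test applied to a single FLAG value. The function name passes and the example flag values are assumptions chosen for illustration; this is not samtools or HapHiC code, only the logic described by the help text quoted above.

```python
# SAM flag bits discussed in this thread (values from the SAM specification).
UNMAPPED = 0x4     # 4: the read is unmapped
SECONDARY = 0x100  # 256: secondary alignment


def passes(flag, require=0, exclude=0):
    """Hypothetical helper: mimic -f (require) and -F (exclude) on one FLAG."""
    has_all_required = (flag & require) == require
    has_no_excluded = (flag & exclude) == 0
    return has_all_required and has_no_excluded


mapped_primary = 99  # 1 + 2 + 32 + 64: paired, properly mapped, first in pair
unmapped_read = 77   # 1 + 4 + 8 + 64: paired, unmapped, mate unmapped

# -f 4 keeps only reads that HAVE the unmapped bit set.
print(passes(mapped_primary, require=UNMAPPED))  # False
print(passes(unmapped_read, require=UNMAPPED))   # True

# -F 4 -F 256 is modelled here as excluding reads with either bit set.
print(passes(mapped_primary, exclude=UNMAPPED | SECONDARY))  # True
print(passes(unmapped_read, exclude=UNMAPPED | SECONDARY))   # False
```

Under this reading, requiring the unmapped bit with -f 4 would discard every mapped alignment, which would be consistent with out_JBAT.txt containing no reads.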

casusr commented 2 days ago

Sorry, I made a mistake. The samtools command I actually used was:

samtools view -@4 -F 4 -F 256 -bq 20

Is this also not acceptable?

zengxiaofei commented 2 days ago

This filtering can leave singleton reads (reads whose mate has been removed), which might cause problems in downstream analyses. In addition, I am not sure whether the MAPQ threshold of 20 (the -q 20 part of -bq 20) is too stringent for your case. Therefore, I recommend starting with the method provided in our documentation, as it has proven effective in most cases.
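
As a rough illustration of how per-read filtering can produce singletons, the sketch below applies the same exclusion bits and MAPQ cutoff to a few made-up records and reports read names that are left with only one mate. The function find_singletons, the (name, flag, mapq) tuple layout, and the sample records are all assumptions for illustration; this is not part of HapHiC.

```python
from collections import Counter


def find_singletons(records, min_mapq=20, exclude=0x4 | 0x100):
    """Hypothetical helper: names left with a single record after filtering.

    records is an iterable of (read_name, flag, mapq) tuples.
    """
    kept = [
        name
        for name, flag, mapq in records
        if (flag & exclude) == 0 and mapq >= min_mapq
    ]
    counts = Counter(kept)
    return sorted(name for name, n in counts.items() if n == 1)


# Made-up example records, two per read pair.
records = [
    ("pairA", 99, 60), ("pairA", 147, 60),  # both mates pass the filter
    ("pairB", 99, 60), ("pairB", 147, 5),   # second mate fails the MAPQ cutoff
    ("pairC", 73, 40), ("pairC", 133, 0),   # second mate is unmapped
]

print(find_singletons(records))  # ['pairB', 'pairC']
```

Each read is judged on its own, so a pair loses one mate whenever only that mate is unmapped or falls below the MAPQ cutoff.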

casusr commented 2 days ago

Thank you very much for your answer.