sstadick / perbase

Per-base per-nucleotide depth analysis
MIT License

Difference only-depth and base-depth #67

Open Akazhiel opened 11 months ago

Akazhiel commented 11 months ago

Hello!

I'm testing this tool because I want to move away from mpileup, and mosdepth doesn't fit my use case. When I run both modes on the same data with the same settings, I get different results. I'm not sure what the difference between the modes is, but I was hoping to get the same depths.

The commands run and the results are:

./perbase only-depth --bed-format -m -q 0 -z -r /Homo_sapiens_assembly19.fasta -t 6 -o test_perbase_onlydepth_matefixed.bed -F 3848 -b coverage.bed final.bam

7   117199644   117199645       151
7   117199645   117199646       144
7   117199646   117199647       143

and

./perbase only-depth -m -Q 1 -q 0 -z -r /Homo_sapiens_assembly19.fasta -t 6 -o test_perbase_basedepth.txt -F 3848 -b coverage.bed final.bam

REF POS REF_BASE    DEPTH   A   C   G   T   N   INS DEL REF_SKIP    FAIL    NEAR_MAX_DEPTH
7   117199644   T   133 0   0   0   6   0   0   127 0   0   false
7   117199645   C   129 0   0   0   0   0   0   129 0   0   false
7   117199646   T   129 0   0   0   0   0   0   129 0   0   false

Thanks in advance!
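To make the discrepancy concrete, here is a minimal Python sketch that lines up the two outputs quoted above and reports the per-position difference. It assumes the BED start column of the first output and the POS column of the second refer to the same coordinate (both runs used -z), and it only checks whether DEPTH in the second output equals A + C + G + T + N + DEL for these three rows; it says nothing about why the tools disagree.

```python
# Column names copied from the header line of the second output above.
HEADER = "REF POS REF_BASE DEPTH A C G T N INS DEL REF_SKIP FAIL NEAR_MAX_DEPTH".split()

# Rows copied verbatim from the two outputs quoted in this comment.
ONLY_DEPTH = """\
7 117199644 117199645 151
7 117199645 117199646 144
7 117199646 117199647 143
"""

BASE_DEPTH = """\
7 117199644 T 133 0 0 0 6 0 0 127 0 0 false
7 117199645 C 129 0 0 0 0 0 0 129 0 0 false
7 117199646 T 129 0 0 0 0 0 0 129 0 0 false
"""


def parse_only_depth(text):
    """Map (ref, start) -> depth for BED-style rows: ref, start, end, depth."""
    depths = {}
    for line in text.splitlines():
        ref, start, _end, depth = line.split()
        depths[(ref, int(start))] = int(depth)
    return depths


def parse_base_depth(text):
    """Map (ref, pos) -> row dict keyed by the header column names."""
    rows = {}
    for line in text.splitlines():
        row = dict(zip(HEADER, line.split()))
        rows[(row["REF"], int(row["POS"]))] = row
    return rows


def compare(only_text, base_text):
    """Return (pos, only_depth, base_depth, difference, depth_matches_parts)."""
    only = parse_only_depth(only_text)
    base = parse_base_depth(base_text)
    report = []
    for key in sorted(only.keys() & base.keys()):
        row = base[key]
        base_total = int(row["DEPTH"])
        # Assumption being tested: DEPTH is the sum of base calls plus deletions.
        parts = sum(int(row[col]) for col in ("A", "C", "G", "T", "N", "DEL"))
        report.append(
            (key[1], only[key], base_total, only[key] - base_total, parts == base_total)
        )
    return report


if __name__ == "__main__":
    for pos, od, bd, diff, consistent in compare(ONLY_DEPTH, BASE_DEPTH):
        print(pos, od, bd, diff, consistent)
```

For the three positions shown, the first output reports 18, 15 and 14 more reads than the second, and almost all of the depth in the second output sits in the DEL column.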

sstadick commented 11 months ago

Hello!

Could you clarify which commands you used? Based on the output format, I'm assuming the first is only-depth and the second is base-depth. Could you also share a subset of data that reproduces what you're seeing above?

I'm not surprised that they are producing different outputs. At first glance I want to say it's how mate-fixing is handled between the two, but it's been years since I looked at this code. If you can share data I'll do some more digging. You could also try running without mate fixing and/or without read filtering to see which of those two might be to blame.
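One way to organize that experiment is to generate all four combinations of mate fixing and read filtering for a given subcommand. The Python sketch below only builds and prints the command lines; it does not run them. It assumes mate fixing corresponds to the -m flag and read filtering to -F 3848, as in the commands earlier in this thread. The output file names are hypothetical, and every other argument is copied from the original commands.

```python
import shlex
from itertools import product

# Arguments shared by both commands in the original report.
COMMON = ["-q", "0", "-z", "-r", "/Homo_sapiens_assembly19.fasta",
          "-t", "6", "-b", "coverage.bed"]


def build_commands(subcommand, extra_args=()):
    """Build one command line per (mate fixing, read filtering) combination."""
    commands = []
    for mate_fix, read_filter in product((True, False), repeat=2):
        args = ["./perbase", subcommand, *extra_args]
        if mate_fix:
            args.append("-m")            # assumed to be the mate-fixing switch
        if read_filter:
            args += ["-F", "3848"]       # assumed to be the read-filtering flag
        # Hypothetical output name that records which options were used.
        label = f"{'m' if mate_fix else 'nom'}_{'F' if read_filter else 'noF'}"
        args += [*COMMON, "-o", f"{subcommand}_{label}.txt", "final.bam"]
        commands.append(shlex.join(args))
    return commands


if __name__ == "__main__":
    for cmd in build_commands("only-depth", ["--bed-format"]):
        print(cmd)
    for cmd in build_commands("base-depth", ["-Q", "1"]):
        print(cmd)
```

Comparing the depth columns of the eight resulting files pairwise should show whether the gap closes when -m, -F, or both are dropped.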

Akazhiel commented 11 months ago

Hello!

Apologies, I copy-pasted and then edited the commands, but yes, the 2nd one is base-depth and the 1st one is only-depth.

Unfortunately I can't share any data because it is private. I'll try disabling the mate-fix option and check how it behaves. Which option are you referring to for read filtering?

Best regards, Jonatan

sstadick commented 11 months ago

No worries! Yes, the -m and -F options. Try running with and without them in different combinations, which should hopefully narrow down the source of the difference.
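For reference, the value 3848 passed to -F in the original commands can be decoded into individual SAM flag bits. The sketch below uses the bit definitions from the SAM specification; that -F excludes reads with any of these bits set is an assumption based on the samtools convention and should be checked against the perbase help text.

```python
# FLAG bit meanings as defined in the SAM specification.
SAM_FLAGS = {
    0x1: "read paired",
    0x2: "read mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read reverse strand",
    0x20: "mate reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "secondary alignment",
    0x200: "fails quality checks",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def decode_flags(value):
    """Return the names of all SAM flag bits set in value, lowest bit first."""
    return [name for bit, name in sorted(SAM_FLAGS.items()) if value & bit]


if __name__ == "__main__":
    for name in decode_flags(3848):
        print(name)
```

Decoded this way, 3848 covers mate unmapped, secondary, QC-fail, duplicate and supplementary reads; the "read unmapped" bit (0x4) is not part of it.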