jelber2 closed this 2 years ago
Note that Effective Coverage is not effective coverage of the reference assembly but of the specific CCS/HiFi read based on its PacBio subreads.
with -m graph and -k 21 added
Thanks for your interest in br and for the results you show. But what is your question?
My apologies, there is no question. Thank you for developing a great tool! Just thought I would share some results.
No problem, I just want to be sure I didn't miss anything.
What is the meaning of "Mean Gap-Compressed Identity Phred"?
@natir I am not entirely sure of the exact meaning; all I can do is share the equation here - https://github.com/PacificBiosciences/harmony/blob/7f076ef4d3ea81d52502b052455398bbace7a818/scripts/single.R#L20 I could redo the analysis with https://github.com/PacificBiosciences/harmony/blob/7f076ef4d3ea81d52502b052455398bbace7a818/scripts/single.R#L21 (mean identity phred) tomorrow.
OK, it seems to be a combination of all types of errors.
# Map each corrected read set to the reference assembly and keep the
# summary numbers (SN lines) from samtools stats.
# NOTE: "samples" is presumably a placeholder for the list of sample names.
for i in samples
do
echo "$i"
~/bin/minimap2-2.24_x64-linux/minimap2 -t 34 -a --secondary=no -x map-hifi \
/nfs/scistore16/itgrp/bioinf/projects/DA0030/2021_Aug_27/analysis.5/raw/pg_asm-0.4.10/Sample9/9-peregrine-2021-0.4.10-3x-circlator.fasta \
"${i}.fasta" 2>/dev/null|samtools stats |grep ^SN | cut -f 2- > "${i}.stats"
done
# Collect the error rate of each run as a Phred score: -10*log10(error rate).
rm -f test
for i in *.stats
do
j=${i%.stats}
echo "${i}" >> test
grep -Po "error rate:\t\S+" "${j}.stats" |cut -f 2|awk '{printf "%0.1f\n",-10*log($1)/log(10);}' >> test
done
br_method PhredScore
gap_size.stats 40.0
graph-k21.stats 42.6
graph.stats 41.0
greedy.stats 26.9
one.stats 35.4
original.stats 26.9
two.stats 31.0
@natir The table above is the error rate from samtools stats converted to Phred. The code is only there to show the actual analysis steps.
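For anyone following along, here is a minimal sketch of that conversion on its own, Phred = -10*log10(error rate). The function name `phred` is hypothetical and is not part of br, minimap2 or samtools. It only assumes awk is available and that the error rate is given as a fraction between 0 and 1.

```shell
# Hypothetical helper: convert an error rate (fraction between 0 and 1)
# into a Phred score, rounded to one decimal place.
phred() {
    # awk has no log10, so divide the natural log by log(10)
    awk -v e="$1" 'BEGIN { printf "%0.1f\n", -10 * log(e) / log(10) }'
}

phred 0.001   # 1 error per 1000 bases, prints 30.0
phred 0.01    # 1 error per 100 bases, prints 20.0
```

Each additional 10 Phred points corresponds to a tenfold lower error rate, so small differences in the table can reflect fairly large relative differences in error rate.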
FYI - following https://github.com/PacificBiosciences/harmony/issues/1#issuecomment-1025682192 using E. coli ~30x PacBio CCS reads