PacificBiosciences / pbbioconda

PacBio Secondary Analysis Tools on Bioconda. Contains list of PacBio packages available via conda.
BSD 3-Clause Clear License

ccs by-strand ip/pw order - documentation clarification #574

Closed: gevro closed this issue 1 year ago

gevro commented 1 year ago

Hello. Following up on issue #532, you previously said: "Tags ip and pw always in the native orientation of the sequence."

However, the PacBio format documentation defines 'native orientation' as: "the same order and sense as collected from the instrument".

Therefore, it is not clear what happens for ccs /rev strand reads in '--by-strand --hifi-kinetics' mode.

The question is: Does ccs output ip and pw in the same orientation as SEQ and QUAL for both /fwd and /rev strand reads?

Thanks.

armintoepfer commented 1 year ago

Since v6.4.0, CCS outputs pw and ip for single-strand reads with HiFi kinetics. They are in the same order and direction as collected from the instrument. We never reverse kinetics, since reversed values would be wrong.

gevro commented 1 year ago

Thanks. However, does ccs ever reverse SEQ and QUAL when constructing the by-strand consensus? That is, are /fwd and /rev SEQ and QUAL also guaranteed to be in the same order/direction as they are collected from the instrument?

I just want to make sure that ip and pw are in the same order as SEQ and QUAL for both /fwd and /rev strand ccs consensus reads (PRIOR to alignment).

armintoepfer commented 1 year ago

In CCS, for every ZMW, we separate subreads of the two strands as the very first step and then treat each strand as its own atomic unit, a by-strand ZMW. All alignments within that atomic unit will have the same native orientation as they are collected from the instrument.

gevro commented 1 year ago

Ok. So the conclusion is that for both /fwd and /rev ccs reads, SEQ, QUAL, ip, and pw will all be in the same native orientation, as collected from the instrument.

Thanks.
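
For anyone pairing bases with kinetics downstream, here is a minimal Python sketch of what the conclusion above implies. It assumes that, before alignment, SEQ, QUAL, ip, and pw share one orientation, and that after alignment to the reverse strand the aligner stores SEQ reverse-complemented and QUAL reversed (per the SAM convention) while ip and pw stay in native orientation. The function names `to_native` and `pair_bases_with_kinetics` are hypothetical helpers written for illustration; they are not part of ccs or any PacBio tool, and the values are made up.

```python
from typing import List, Sequence, Tuple

# Translation table for complementing bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def to_native(seq: str, qual: Sequence[int], is_reverse: bool) -> Tuple[str, List[int]]:
    """Return SEQ and QUAL in native orientation.

    Assumption: for a reverse-strand alignment the record stores SEQ
    reverse-complemented and QUAL reversed, so we undo both. Unaligned
    or forward-aligned records are returned unchanged.
    """
    if not is_reverse:
        return seq, list(qual)
    return seq.translate(_COMPLEMENT)[::-1], list(qual)[::-1]


def pair_bases_with_kinetics(
    seq: str,
    qual: Sequence[int],
    ip: Sequence[int],
    pw: Sequence[int],
    is_reverse: bool = False,
) -> List[Tuple[str, int, int, int]]:
    """Zip each base with its quality, ip, and pw value.

    ip and pw are taken as-is, on the assumption that they are always
    stored in native orientation; only SEQ and QUAL are flipped back.
    """
    native_seq, native_qual = to_native(seq, qual, is_reverse)
    if not (len(native_seq) == len(native_qual) == len(ip) == len(pw)):
        raise ValueError("SEQ, QUAL, ip and pw must all have the same length")
    return list(zip(native_seq, native_qual, ip, pw))


# Unaligned by-strand read: everything is already in native orientation.
unaligned = pair_bases_with_kinetics("AACG", [10, 20, 30, 40], [1, 2, 3, 4], [5, 6, 7, 8])

# The same read after a reverse-strand alignment: SEQ/QUAL are flipped in
# the record, ip/pw are not, and the pairing comes out identical.
aligned_rev = pair_bases_with_kinetics(
    "CGTT", [40, 30, 20, 10], [1, 2, 3, 4], [5, 6, 7, 8], is_reverse=True
)
```

With an unaligned ccs BAM no flipping is needed; the `is_reverse` branch only matters once the reads have been aligned.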