Number of Pacasus iterations per read in the cleaned data set?

Hi Maillinia,

That information is in the resulting read name. For each iteration, a suffix is appended to the end of the name: '_a1'/'_a2' or '_b1'/'_b2'. The '1' stands for the left part of the read that was split and '2' for the right part. 'a' is used when the read is split roughly in half and 'b' in all other cases. So if I start with a read named 'myPacBio', the resulting fasta file might contain:

myPacBio_b1 : left part of the read, first iteration
myPacBio_b2_b1 : right part of the read, split a second time. This is the left part of that second split (i.e. the middle part of the original sequence)
myPacBio_b2_b2 : right part of the read, split a second time. This is the right part of that second split

Originally: myPacBio = |myPacBio_b1|myPacBio_b2_b1|myPacBio_b2_b2|

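Hence the number of _[ab][12] elements at the end of a read name tells you how many times that read has been split, i.e. how many iterations it went through. A read without any such suffix was never split.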
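
If you want to count this over a whole output file, a minimal Python sketch along these lines should work. It is not part of Pacasus; the function names (split_count, original_name, iteration_histogram) are made up for illustration. It assumes the read name is the first whitespace-separated token of each fasta header, and that the original read names do not themselves end in something like '_a1' or '_b2', since that would be counted as a split.

```python
import re
from collections import Counter

# One or more '_a1', '_a2', '_b1' or '_b2' suffixes at the very end of the name.
SUFFIX_RUN = re.compile(r"(?:_[ab][12])+$")


def split_count(read_name):
    """Return how many split suffixes are stacked at the end of the read name."""
    match = SUFFIX_RUN.search(read_name)
    if match is None:
        return 0
    # Every suffix is exactly three characters long, e.g. '_b2'.
    return len(match.group(0)) // 3


def original_name(read_name):
    """Strip all trailing split suffixes to recover the original read name."""
    return SUFFIX_RUN.sub("", read_name)


def iteration_histogram(fasta_lines):
    """Map number of splits -> number of reads, given the lines of a fasta file."""
    histogram = Counter()
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        tokens = line[1:].split()
        if not tokens:
            continue  # empty header, nothing to count
        # Assumption: the read name is the first token of the header.
        histogram[split_count(tokens[0])] += 1
    return histogram
```

For the example above, split_count('myPacBio_b1') gives 1, split_count('myPacBio_b2_b1') gives 2, and original_name('myPacBio_b2_b2') gives 'myPacBio'. To summarise a file, pass an open file handle to iteration_histogram, e.g. iteration_histogram(open('cleaned.fasta')), where 'cleaned.fasta' is a placeholder for your own output file.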

swarris / Pacasus