Illumina / interop

C++ Library to parse Illumina InterOp files
http://illumina.github.io/interop/index.html
GNU General Public License v3.0

Python: `index_summary` : Cluster Count == # of seq * 2 of a PE run? #271

Open sklages opened 3 years ago

sklages commented 3 years ago

I hadn't noticed this before:

index_summary(run_metrics, level='Barcode') returns as Cluster Count the total number of reads of a PE run, not the actual number of clusters (= single reads, R1 seq count, whatever). That's confusing.

Is that intended?

This is interop-1.1.23.

ezralanglois commented 3 years ago

I think you mean a dual-index run, and yes, this is annoying. As far as I know, it has always been this way, and we would need to rev the major/minor version to change it, because this is potentially a breaking change for downstream applications.

sklages commented 3 years ago

No, a simple paired-end run, regardless of whether it is a single-index or dual-index run. Cluster Count always seems to be the sum of the sequences of R1 and R2.

The upper part is the actual R1 fastq file, the lower part is the output from index_summary(run_metrics, level='Barcode'). [image]
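Until this is changed, a downstream script could correct the value itself. The following is only a minimal sketch, assuming the reported Cluster Count really is the sum over all sequencing reads (R1 + R2 for a PE run) as observed above. The helper name `estimate_cluster_count` and its parameters are hypothetical and not part of interop.

```python
def estimate_cluster_count(reported_count: int, num_sequencing_reads: int = 2) -> int:
    """Derive the per-cluster count from the value reported as Cluster Count.

    Assumption (from the observation in this issue, not from the interop docs):
    the reported value equals clusters * number of non-index reads.
    """
    if num_sequencing_reads < 1:
        raise ValueError("num_sequencing_reads must be at least 1")

    clusters, remainder = divmod(reported_count, num_sequencing_reads)
    if remainder:
        # A non-zero remainder means the assumption above does not hold
        # for this run, so refuse to return a misleading number.
        raise ValueError(
            f"{reported_count} is not divisible by {num_sequencing_reads}"
        )
    return clusters


# Example with made-up numbers: a PE run reported as 2,000,000
# would correspond to 1,000,000 clusters under this assumption.
pe_clusters = estimate_cluster_count(2_000_000, num_sequencing_reads=2)
```

The result can be checked against the sequence count of the R1 fastq file for the same barcode; if the two disagree, the assumption does not apply to that run.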

ezralanglois commented 3 years ago

Ya, you are right. I misinterpreted the code.