Illumina / interop

C++ Library to parse Illumina InterOp files
http://illumina.github.io/interop/index.html
GNU General Public License v3.0

Unit of cluster_count_pf? #310

yfu-recursion closed 1 year ago

yfu-recursion commented 1 year ago

Do you know where I can find the unit of cluster_count_pf? The value seems to suggest it is in thousands, but I can't find the corresponding documentation.

I see 3 versions defined here: https://github.com/Illumina/interop/blob/c98d2689941cd557e6dad43884ff12b55b3e327b/interop/model/metrics/tile_metric.h#L334-L349, but it looks like run_summary() only has cluster_count_pf() available.

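>>> from interop import py_interop_run_metrics, py_interop_summary
>>> run_metrics = py_interop_run_metrics.run_metrics()
>>> run_metrics.read(run_folder)
>>> summary = py_interop_summary.run_summary()
>>> py_interop_summary.summarize_run_metrics(run_metrics, summary)
>>> summary.at(0).at(0).cluster_count_pf().median()
2682877.5

ezralanglois commented 1 year ago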

What version of InterOp is this? Also, which instrument and consumable type?

This should be a count of clusters. It is strange that we get a floating point value, but that is because the median is interpolated. The fractional value suggests this is a random platform like MiSeq or HiSeq.

yfu-recursion commented 1 year ago

Hi @ezralanglois! This is NovaSeq 6000 and S4. I am using interop 1.1.23.

$ pip list | grep interop
interop                     1.1.23

ezralanglois commented 1 year ago

Apologies, I confused cluster count with PF cluster count.

For the run and read level, we report the sum of all the tiles, so you have cluster_count or cluster_count_pf.

For the lane level, we report statistics over all the tiles. So median means the median cluster count PF for all tiles in that lane. The unit is clusters, but the value is the median number of clusters per tile.
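
To make the difference between a per-tile median and a summed count concrete, here is a minimal sketch in plain Python. It does not call InterOp. The per-tile values are made up for illustration, `lane_stats` is a hypothetical helper, and `statistics.median` is only assumed to behave like the library's interpolated median when the number of tiles is even.

```python
from statistics import median

def lane_stats(tile_cluster_count_pf):
    """Summarize per-tile PF cluster counts for one lane.

    Hypothetical helper for illustration only; not part of InterOp.
    """
    return {
        # Lane-level statistic: median over tiles (clusters per tile).
        "median_per_tile": median(tile_cluster_count_pf),
        # Summed value: total PF clusters across all tiles.
        "total": sum(tile_cluster_count_pf),
    }

# Made-up PF cluster counts for four tiles in one lane.
tiles = [2_650_120, 2_675_310, 2_690_445, 2_701_980]
stats = lane_stats(tiles)

# With an even number of tiles the median is the mean of the two middle
# values, which is why it can be fractional even though counts are integers.
print(stats["median_per_tile"])  # 2682877.5
print(stats["total"])            # 10717855
```

Both numbers are in clusters. The median describes a typical tile, while the total describes the whole lane.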

yfu-recursion commented 1 year ago

@ezralanglois Thank you for the detailed reply!