Closed rschmieder closed 3 years ago
Have you checked the PF cluster count? I would expect that to match.
The percentage could be off due to rounding errors.
The value for reads_pf is a number, not a percentage. Do you have a code example of how to get the PF cluster count you suggested? I only see cluster_count_pf() for read, and that is of type metric_stat, which doesn't seem to provide a count or total (only mean, median, stddev).
Ah, apologies. I had it in my head that it was the percentage, not the count (we should really include that in the name). That should match. Let me look into it.
Ok, yes, you are correct. metric_stat just provides the mean per lane (and other, less useful stats). Summing that up only gives you the sum of the per-lane means, which won't be the same as the cluster count.
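To see the difference with made-up numbers (these per-tile counts are hypothetical, not real data): the sum of per-lane means is off from the true total by a factor of roughly the number of tiles per lane.

```python
# Hypothetical per-tile PF cluster counts for a tiny two-lane run
lanes = {
    1: [1000.0, 2000.0, 3000.0],  # lane 1 tiles
    2: [4000.0, 5000.0, 6000.0],  # lane 2 tiles
}

# Summing the per-lane means (lane 1 mean = 2000, lane 2 mean = 5000)
sum_of_lane_means = sum(sum(tiles) / len(tiles) for tiles in lanes.values())

# The actual total PF cluster count: sum every tile directly
total_cluster_count = sum(sum(tiles) for tiles in lanes.values())

print(sum_of_lane_means)    # 7000.0
print(total_cluster_count)  # 21000.0
```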
Here is one way to get the PF cluster count you are looking for:
from interop import py_interop_run_metrics

run_metrics = py_interop_run_metrics.run_metrics()
run_metrics.read(source_dir)  # source_dir: path to the run folder

pf_cluster_count = 0
tile_metric_set = run_metrics.tile_metric_set()
for i in range(tile_metric_set.size()):
    pf_cluster_count += tile_metric_set.at(i).cluster_count_pf()
print(pf_cluster_count)
There are some other ways, but they require considerably more code and are only interesting if you want to get many metrics.
Thank you, that example generates numbers matching the demux output. Any idea why reads_pf generates a different value? Are there other metrics that should be parsed using tile_metric_set instead of run_summary?
Sorry, I have not dug around in this part of the code base for a while. Those values should match.
This is a bug. I just reproduced it for a NextSeq2k run. We can fix this.
1403945256.0
1403945152.0
👍
Using the Python library, I can use the following to parse the number of PF reads from the InterOp files:
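(The snippet itself did not survive in this thread; the following is a sketch of the likely shape, based on the run_summary approach discussed above. run_folder is a placeholder path, and the exact accessor used in the original post may have differed.)

```python
from interop import py_interop_run_metrics, py_interop_summary

# Load the InterOp data for a run; run_folder is a placeholder path
run_metrics = py_interop_run_metrics.run_metrics()
run_metrics.read(run_folder)

# Build the run-level summary and read the PF read count from it.
# Note the value is stored as a 32-bit float internally.
summary = py_interop_summary.run_summary()
py_interop_summary.summarize_run_metrics(run_metrics, summary)
total_reads_pf = summary.total_summary().reads_pf()
print(total_reads_pf)
```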
The resulting number for total_reads_pf does not match the number of reads in the Demultiplex_Stats.csv output file. Sometimes it's higher, sometimes it's lower; the overall difference is small. Is this expected because the value is stored as a float (based on http://illumina.github.io/interop/classillumina_1_1interop_1_1model_1_1summary_1_1stat__summary.html) and the difference is caused by rounding errors?
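For what it's worth, float rounding alone can account for discrepancies of this magnitude: a 32-bit IEEE-754 float has a 24-bit significand, so integers above 2^24 (16,777,216) are not all exactly representable, and counts around 1.4 billion get rounded to the nearest multiple of 128. A small standard-library demonstration (the exact gap in the run above also depends on how the per-tile values were accumulated, so it won't match this single round-trip exactly):

```python
import struct

def roundtrip_float32(x):
    # Pack as a 32-bit IEEE-754 float and unpack again,
    # mimicking a value stored in a C++ `float` field
    return struct.unpack('f', struct.pack('f', x))[0]

exact = 1403945256                  # a read count in the billions
stored = roundtrip_float32(exact)
print(stored)                       # 1403945216.0 -- off by 40
```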