peterjc / thapbi-pict

Tree Health and Plant Biosecurity Initiative - Phytophthora ITS1 Classifier Tool
https://thapbi-pict.readthedocs.io/
MIT License

Pooled marker counts for cutadapt, singletons for scripts/plot_reduction.py #565

Closed by peterjc 1 year ago

peterjc commented 1 year ago

Currently scripts/plot_reduction.py only works on a single marker, not the pooled report files (it wants to access the Cutadapt and Singletons columns).

One way to fix this is to enhance the pooled report to include some or all of:

See https://github.com/peterjc/thapbi-pict/blob/v1.0.0/thapbi_pict/summary.py#L846 where these non-trivial fields are dropped:

    if len(markers) > 1:
        # TODO: How best to show the cutadapt and threshold values (per marker)?
        sample_stats = {
            sample: [values[i] for i in (0, 1, 5)]
            for sample, values in sample_stats.items()
        }
        stats_fields = tuple(stats_fields[i] for i in (0, 1, 5))
        assert stats_fields == ("Raw FASTQ", "Flash", "Control"), stats_fields
        blank_stats = [-1] * len(stats_fields)

However, the harder part comes before this, where the values are loaded: because of how the dict is used, the final marker's stats override those of any earlier marker.
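To illustrate the override problem, here is a minimal sketch; it is not the actual loading code in summary.py. The marker names, sample name, counts, and the `pool_stats` helper are all hypothetical. It also assumes that the Raw FASTQ and Flash values are the same for every marker, while Cutadapt and Singletons differ per marker.

```python
# Hypothetical per-marker input (marker -> sample -> stats); all values are made up.
per_marker = {
    "ITS1": {"sample1": {"Raw FASTQ": 1000, "Flash": 900, "Cutadapt": 600, "Singletons": 50}},
    "COI": {"sample1": {"Raw FASTQ": 1000, "Flash": 900, "Cutadapt": 250, "Singletons": 20}},
}

# The problem: a dict keyed only by sample is overwritten by each later marker.
overwritten = {}
for marker, samples in per_marker.items():
    for sample, values in samples.items():
        overwritten[sample] = values  # the last marker processed wins

SHARED = ("Raw FASTQ", "Flash")  # assumed identical across markers
PER_MARKER = ("Cutadapt", "Singletons")  # assumed to differ per marker


def pool_stats(per_marker):
    """Build one row per sample, keeping a separate column per marker."""
    pooled = {}
    for marker, samples in per_marker.items():
        for sample, values in samples.items():
            # Shared fields are recorded once, from the first marker seen.
            row = pooled.setdefault(sample, {f: values[f] for f in SHARED})
            for field in PER_MARKER:
                row[f"{field} ({marker})"] = values[field]
    return pooled


pooled = pool_stats(per_marker)
```

With per-marker column names such as "Cutadapt (ITS1)", the pooled row keeps every marker's values, and scripts/plot_reduction.py could pick the columns it needs. Whether to report separate columns or a single combined value is a design choice this sketch does not settle.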