Adds optional splitting functionality during HTML generation
The standard report/latest report is always generated
If the splitting flag is set, the dataset is additionally split into subset reports
If all external IDs start with a consistent pattern (YYW, where each Y is a digit and YY denotes the year 20YY), the individuals are split by that prefix into separate reports, named subset_YYW.html and subset_YYW_latest.html
If no consistent pattern is found, the affected individuals are split arbitrarily into a number of separate reports, subset_N.html and subset_N_latest.html, where N is the index of the split report
There's no intention to register all these files in metamist, and it's expected that not all reports will require splitting
This is a naive split: details like the cohort metadata and variant count statistics are not reduced to cover only the subset in each report.
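The grouping rule described above could look roughly like the sketch below. This is an illustration only, not the code in this PR: the function name `split_external_ids`, the `chunk_size` parameter, and the reading of "W" as a literal letter after the two year digits are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumption: the prefix is two digits (the YY of 20YY) followed by a literal "W".
PREFIX_RE = re.compile(r'^(\d{2}W)')


def split_external_ids(external_ids, chunk_size=100):
    """Group external IDs into report stems (hypothetical helper).

    Returns a dict mapping a stem such as 'subset_21W' or 'subset_1'
    to the list of external IDs that belong in that report.
    """
    matches = [PREFIX_RE.match(ext_id) for ext_id in external_ids]
    groups = defaultdict(list)

    # Only split by prefix when every ID carries the pattern.
    if external_ids and all(matches):
        for ext_id, match in zip(external_ids, matches):
            groups[f'subset_{match.group(1)}'].append(ext_id)
        return dict(groups)

    # Fallback: arbitrary fixed-size chunks, numbered from 1.
    # chunk_size is an assumed knob, not something the PR specifies.
    for index, ext_id in enumerate(external_ids):
        groups[f'subset_{index // chunk_size + 1}'].append(ext_id)
    return dict(groups)


def report_names(stem):
    """Return the two file names written for one subset."""
    return f'{stem}.html', f'{stem}_latest.html'
```

Each stem then maps to two files, for example `report_names('subset_21W')` gives `subset_21W.html` and `subset_21W_latest.html`.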
Fixes
Proposed Changes
Checklist
This definitely needs some testing