jeromekelleher / sc2ts

Infer a succinct tree sequence from SARS-COV-2 variation data
MIT License

pango_recombinant_lineages_report is misleading #111

Open jeromekelleher opened 1 year ago

jeromekelleher commented 1 year ago

The pango_recombinant_lineages_report function doesn't give very meaningful results because it just chooses the first sample as the representative, assuming that it is the earliest and therefore the "causal" one. The picture is much more complex in reality, with multiple origins and recombinants shared across lineages.

The function should be rewritten to try to summarise this information.

https://github.com/jeromekelleher/sc2ts/blob/265de6cda61cfa892c64982fb440c47cf2a33aed/sc2ts/utils.py#L728
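
As a rough illustration of what such a summary might look like, here is a minimal sketch in plain Python. It does not use the sc2ts API: the function name `summarise_recombinant_lineages`, the record layout `(sample_id, pango, recombinant_node, date)` and the example data are all assumptions made up for this example. Instead of picking one representative sample, it reports, for each Pango lineage, how many distinct recombinant origins its samples trace to, the earliest sample under each origin, and which other lineages share those recombinants.

```python
from collections import defaultdict


def summarise_recombinant_lineages(records):
    """Summarise recombinant origins per Pango lineage.

    `records` is an iterable of (sample_id, pango, recombinant_node, date)
    tuples. This layout is hypothetical, not the sc2ts data model; dates
    are assumed to be ISO strings so that they sort chronologically.
    """
    # pango -> recombinant node -> list of (date, sample_id)
    by_lineage = defaultdict(lambda: defaultdict(list))
    # recombinant node -> set of Pango lineages found below it
    lineages_per_recombinant = defaultdict(set)

    for sample_id, pango, recombinant, date in records:
        by_lineage[pango][recombinant].append((date, sample_id))
        lineages_per_recombinant[recombinant].add(pango)

    summary = {}
    for pango, origins in by_lineage.items():
        # Every lineage that shares at least one recombinant with this one.
        related = set()
        for recombinant in origins:
            related |= lineages_per_recombinant[recombinant]
        summary[pango] = {
            "num_samples": sum(len(s) for s in origins.values()),
            "num_origins": len(origins),
            # Earliest sample per origin, rather than one per lineage.
            "earliest_per_origin": {
                recombinant: min(samples)[1]
                for recombinant, samples in origins.items()
            },
            "shared_with": sorted(related - {pango}),
        }
    return summary


# Made-up example data: lineage XA has two origins, one shared with XB.
example_records = [
    ("s1", "XA", 100, "2021-01-10"),
    ("s2", "XA", 100, "2021-01-05"),
    ("s3", "XA", 200, "2021-02-01"),
    ("s4", "XB", 200, "2021-01-20"),
]
example_summary = summarise_recombinant_lineages(example_records)
```

In this toy input, a "first sample" report would show a single origin for XA, whereas the summary exposes both origins and the overlap with XB. How the real function should obtain the recombinant node for each sample is left open here.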