jbloomlab / SARS2-mut-fitness

Observed substitution counts of SARS-CoV-2 compared to those expected under the mutation rates
MIT License

enable multiple mutation-annotated trees (datasets) #23

Closed jbloom closed 1 year ago

jbloom commented 1 year ago

This pull request enables the pipeline to compute fitness estimates for multiple mutation-annotated trees (MATs).

The config.yaml now has a mat_trees key that can specify several mutation-annotated trees, each keyed by a name. The pipeline is run once for each tree, and the results are placed in a ./results_{mat}/ subdirectory.

In addition, one of the mat_trees entries is designated as the current MAT, and its results also go into ./results/.
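
As a rough illustration of the layout described above, the following sketch maps each named MAT to the directories its results would be written to. Only the mat_trees key and the ./results_{mat}/ and ./results/ paths come from this pull request. The dataset names, the per-dataset url field, and the current_mat key name are assumptions made for the example, and the config is written as a plain dict rather than loaded from config.yaml.

```python
from pathlib import Path

# Hypothetical config contents. Only `mat_trees` is named in this PR;
# the dataset names, the `url` field, and `current_mat` are assumptions.
config = {
    "mat_trees": {
        "dataset_a": {"url": "https://example.org/dataset_a.pb.gz"},
        "dataset_b": {"url": "https://example.org/dataset_b.pb.gz"},
    },
    "current_mat": "dataset_b",
}


def results_dirs(config):
    """Map each named MAT to the result directories it writes to."""
    current = config["current_mat"]
    if current not in config["mat_trees"]:
        raise ValueError(f"current_mat {current!r} is not a key of mat_trees")
    dirs = {}
    for mat in config["mat_trees"]:
        # Every MAT gets its own ./results_{mat}/ directory.
        targets = [Path(f"results_{mat}")]
        # The current MAT is additionally written to ./results/.
        if mat == current:
            targets.append(Path("results"))
        dirs[mat] = targets
    return dirs


for mat, targets in results_dirs(config).items():
    print(mat, [str(t) for t in targets])
```

Checking that the current MAT is actually one of the mat_trees keys catches a misconfigured name before any per-dataset work starts.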

The GitHub Pages HTML rendering in ./docs/ shows the current MAT by default, but also links to the interactive HTML plots for each individual MAT dataset.
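
One way such per-dataset links could be generated is sketched below. This is not the repository's actual docs-building code: the function name dataset_links, the Markdown list format, and the {mat}.html page names are all hypothetical, since the file layout inside ./docs/ is not stated here.

```python
def dataset_links(mat_names, current):
    """Return Markdown link lines, one per MAT dataset, marking the current one."""
    lines = []
    for mat in mat_names:
        label = f"{mat} (current)" if mat == current else mat
        # Hypothetical page name; the real layout of ./docs/ is not given in this PR.
        lines.append(f"- [{label}]({mat}.html)")
    return lines


print("\n".join(dataset_links(["dataset_a", "dataset_b"], current="dataset_b")))
```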

This change is therefore backward compatible (./results/ and the GitHub Pages docs still show the single current dataset), while also keeping results and interactive plots for the other datasets.

This is necessary both to archive older estimates and to enable comparison between estimates from different datasets.