Open hyanwong opened 1 year ago
I agree, good to systematise this.
I'll have a go if I get a chance.
It would be nice to do this, but I think we should pull some of the code in plots.py into notebooks, following the other sections. In particular, we should run Dendroscope and store the newicks in the git repo (in the data dir).
Running the plots should only take a few seconds; anything that requires computation should be done in a notebook and cached.
I've added some automation in the notebooks directory for running the notebooks, and some of these already export data files. We can further systematise then.
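For what it's worth, the "compute in a notebook, cache in the data dir" idea could look roughly like the sketch below. The helper name `cached_text` and the example file name are made up for illustration (they are not in the repo); the point is only that the slow step runs once and the plotting code reads the stored result afterwards.

```python
from pathlib import Path


def cached_text(path, compute):
    """Return the text stored at `path`, calling `compute()` only if it is missing.

    `compute` is any zero-argument function returning a string, e.g. a Newick
    tree produced by a slow external program. Hypothetical helper, not part
    of the repo.
    """
    path = Path(path)
    if path.exists():
        # Cached result from an earlier run: no computation needed.
        return path.read_text()
    result = compute()
    # Create the data directory if needed, then store the result for next time.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(result)
    return result
```

A notebook would call something like `cached_text("data/example.nwk", run_slow_step)` (both names hypothetical), and the cached file would then be committed to the repo.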
> in particular, we should run dendroscope and store the newicks in the git repo (in the data dir)
Yeah, good plan. Shall I make an extra notebook for this, then? Or I guess I could add it into the (already existing) Cophylo-treecmp.ipynb?
Whichever makes most sense. Note the updated notebook names, though (they now coincide with section numbers, for findability).
It would be handy for someone other than me to write a makefile for the figures in the repo ('cos I'm rubbish at writing Makefiles, and it will take me ages).
From my point of view, we would assume that the two files
- upgma-mds-1000-md-30-mm-3-2022-06-30-recinfo-il.ts.tsz and
- upgma-full-md-30-mm-3-2021-06-30-recinfo-il.ts.tsz

will have been put in the data/ directory by the user, and that the Dendroscope and chromium binaries are available somewhere (I'm not sure how to encode that in the plot.py file: currently it's hardcoded, sorry). Everything else can be automated, I hope. This includes:
- nextstrain_ncov_gisaid_global_all-time_timetree-2023-01-21.nex.gz
- running make_csv_files.py (https://github.com/jeromekelleher/sc2ts-paper/issues/120#issuecomment-1563004634) to create e.g. data/breakpoints_long_2022-06-30.csv
- python plot.py all (which should create the figure PDFs)

Obviously the locations of the stored files, names of directories, etc. can be changed in whatever way seems most logical.
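On the hardcoded binaries: one possible way to avoid hardcoding is to look for an environment variable first and fall back to searching PATH. This is only a sketch of that idea, not what plot.py currently does; the function name `find_binary` and the variable and program names in the usage note are assumptions.

```python
import os
import shutil


def find_binary(env_var, default_name):
    """Return the path to an external program, or raise if it cannot be found.

    Uses the value of the environment variable `env_var` if it is set,
    otherwise searches PATH for `default_name`. Hypothetical helper; both
    names are chosen by the caller.
    """
    candidate = os.environ.get(env_var) or default_name
    # shutil.which accepts either a bare program name or a full path.
    path = shutil.which(candidate)
    if path is None:
        raise FileNotFoundError(
            f"Cannot find {candidate!r}; set {env_var} to the full path of the binary"
        )
    return path
```

Usage might be `find_binary("DENDROSCOPE_BIN", "Dendroscope")`, where `DENDROSCOPE_BIN` is an invented variable name and the executable name may differ by platform. A missing binary then fails early with a clear message rather than partway through plotting.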