CDCgov / cfa-viral-lineage-model

Apache License 2.0

Package scope as pertains to data processing #59

Open afmagee42 opened 1 month ago

afmagee42 commented 1 month ago

Looking over #52 got me revisiting package and repo scope.

It is hard to disentangle evaluation and fitting, but a "package for evaluable lineage modeling" is possible. It would set standards for the data you can put in, but it wouldn't be able to create that data. That is, it would take the "clean data" from the flow chart as a required input, and linmod.data would be removed.
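To make "standards for the data you can put in" concrete, here is a rough sketch of what such a check might look like. The column names and types below are hypothetical placeholders, not the actual linmod schema, and `validate_clean_data` is not an existing function in the package.

```python
from datetime import date

# Hypothetical schema: these column names and types are illustrative
# assumptions, not taken from linmod.
REQUIRED_COLUMNS = {"date": date, "division": str, "lineage": str, "count": int}


def validate_clean_data(rows):
    """Return a list of problems found in `rows` (empty if the data conform)."""
    problems = []
    for i, row in enumerate(rows):
        # Every row must carry every required column with the expected type.
        for column, expected_type in REQUIRED_COLUMNS.items():
            if column not in row:
                problems.append(f"row {i}: missing column '{column}'")
            elif not isinstance(row[column], expected_type):
                problems.append(
                    f"row {i}: '{column}' should be {expected_type.__name__}"
                )
        # Counts of observed sequences cannot be negative.
        if isinstance(row.get("count"), int) and row["count"] < 0:
            problems.append(f"row {i}: 'count' must be non-negative")
    return problems
```

With something like this, the package would accept any data source that passes the check, and the code that produces conforming data could live elsewhere.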

On the other hand, a package that is only the modeling and evaluation (and perhaps plotting thereof) isn't directly usable on its own. Having Nextstrain interfacing baked in has been a big help for making progress in development.

As it is, we already have pipeline folders and the linmod package/folder in this repo. So rather than carving out linmod into an entirely separate repo, or removing the non-linmod components, we could move data-wrangling to its own one-script "package" (data or nextstrain-data).