thecapitalistcycle / covid19model

Code for modelling estimated deaths and cases for COVID19.
MIT License

Python README #1

Open dentarthur opened 4 years ago

dentarthur commented 4 years ago

Hi @cc-a

I was very glad to see your pull request for making it easy to install the covid19model:

https://github.com/ImperialCollegeLondon/covid19model/pull/4

I'm writing to you as an issue in my own fork, to keep out of the way while making a follow-up suggestion. I'm addressing it to you because you are at Imperial College and are thinking about how to make access to the COVID-19 response team's work easier.

I cannot act on the suggestions myself, for the reasons given at the end, so I want to make them while staying out of the way.

Suggestions

  1. Create a README addressed to pythonistas, linked from the main README, requesting a rapid port of exactly the same model from R (and Stan) to Python (and PyStan), using exactly the same inputs.

  2. I checked quickly through the forks created before mine to see if anyone was already doing this and did not notice any, but I am reasonably certain that people keen to do it will turn up very soon. The README should invite them to submit pull requests adding themselves to a list, kept within that README, of people starting port projects, so that they find each other quickly.

  3. That may be all that is needed to speed things up. But it should be done by somebody connected to the team at Imperial College, in case further liaison is needed. Obviously the epidemiologists and software engineers working with R-based models and HPC clusters don't have any spare time at the moment, but there are an awful lot of Python data scientists who can get to work on dashboards, visualizations, Explorable Explanations etc. once they can run the model using only tools they are familiar with, without R.

  4. I don't think a competent Python data scientist would have much trouble porting it by initially just wrapping the R code with rpy2 and then making it fully Python step by step, while it keeps working identically. So it shouldn't need much input from Imperial College beyond what has already been done. From there it will find its way towards Explorable Explanations that can be easily embedded in web pages, such as those here:

https://observablehq.com/collection/@observablehq/coronavirus

I think the model will reach a very wide web audience faster via Python than via R with Shiny dashboards, as used among epidemiologists.

  5. But I'm not a competent Python data scientist, so I am just whispering in your ear.

PS

I'm also stuck in self-isolation, typing this on an Android tablet while setting up a replacement computer and a replacement email address, after my phone was stolen just as lockdown began here in Melbourne, Australia. So I cannot easily communicate at all for a few days.

harrisonzhu508 commented 4 years ago

Hi, first of all, thank you for your suggestions.

  1. Agreed, and it could link to what I suggest below.

2, 3. It might be better in the long term to convert some of the R code (the main base.r, excluding the plotting code) into Python, as there is also a Python interface for Stan (https://pystan.readthedocs.io/en/latest/).

I can have a look later today or this week, but feel free to start trying it out. It should be mostly just boilerplate data-wrangling code to feed into Stan.
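To give a rough idea of the kind of wrangling meant here, the following is a minimal sketch that uses only the standard library. The column names (`country`, `date`, `deaths`), the dict keys and the `-1` padding value are all assumptions made for illustration; they are not taken from base.r or from the Stan model itself.

```python
import csv
import io
from collections import defaultdict


def build_stan_data(csv_text):
    """Turn long-format rows (country, date, deaths) into a dict of plain
    ints and lists, the general shape of input a Stan data block takes.

    Column names and dict keys are hypothetical, not the model's own.
    """
    series = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        series[row["country"]].append((row["date"], int(row["deaths"])))

    countries = sorted(series)
    # ISO dates sort correctly as strings, so a plain sort orders each series by day.
    deaths = [[count for _, count in sorted(series[c])] for c in countries]
    n_days = [len(d) for d in deaths]
    max_days = max(n_days)
    # Stan arrays are rectangular, so shorter series are padded;
    # -1 is an assumed marker for "no observation".
    padded = [d + [-1] * (max_days - len(d)) for d in deaths]

    data = {
        "n_countries": len(countries),
        "n_days": n_days,
        "max_days": max_days,
        "deaths": padded,
    }
    return data, countries


sample = """country,date,deaths
Italy,2020-03-02,18
Italy,2020-03-01,12
Spain,2020-03-01,3
"""
data, countries = build_stan_data(sample)
print(countries)
print(data)
```

A dict of this kind is what would then be passed to PyStan as the model data, once its keys are renamed to match the names declared in the Stan file.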

PS. Ouch - hope you manage your setup soon!

dentarthur commented 4 years ago

Thanks @harrisonzhu508

Yes, I mentioned PyStan in para 1. I certainly agree that the R code should be converted: initially "some", then quickly "all", so that there is no dependency on R but only on PyStan.

My suggestion of starting by wrapping the R code in rpy2 is simply to emphasize the need to start by porting an identical model, one that produces the same outputs from the same inputs (and can load future revised inputs in the same formats).
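The "same outputs from same inputs" check could be sketched roughly as below. This assumes both the R run and the Python run export their results as named lists of numbers; the names `estimated_deaths` and `rt` and the figures are invented for illustration. Since Stan samples randomly, an exact match is only plausible with identical seeds, so in practice the tolerance would probably need loosening and the comparison made on summary values.

```python
import math


def compare_outputs(reference, candidate, rel_tol=1e-6):
    """Return the names in `reference` whose values are missing from
    `candidate`, differ in length, or differ beyond the relative tolerance."""
    mismatches = []
    for name, ref_values in reference.items():
        new_values = candidate.get(name)
        if new_values is None or len(new_values) != len(ref_values):
            mismatches.append(name)
            continue
        same = all(
            math.isclose(r, n, rel_tol=rel_tol, abs_tol=1e-12)
            for r, n in zip(ref_values, new_values)
        )
        if not same:
            mismatches.append(name)
    return mismatches


# Hypothetical summaries from the original R run and from the Python port.
r_run = {"estimated_deaths": [1.0, 2.5, 4.0], "rt": [3.1, 2.8]}
py_run = {"estimated_deaths": [1.0, 2.5, 4.0], "rt": [3.1, 2.9]}

print(compare_outputs(r_run, py_run))
```

An empty list would mean the port reproduces the reference run within the tolerance; any names listed are the quantities to investigate first.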

I have downloaded the Stan docs and will be reading them. But there is no chance of me trying anything out until long after others have done what needs to be done quickly. Hence my offline whispering ;-)

Anyway, you, or whoever takes up my suggestion to add a Python README, will have a far better idea than I do of what to put into it. I think pythonistas will run with their own ideas anyway. My main point is to speed up their coordination: the main README should link to the Python README, which should list links to people working on a port that uses Python and PyStan instead of R. That way they get in touch with each other, and do so via connections with the Imperial College team.

Am now following you and look forward to keeping in touch as soon as I can.