Many of the current issues concern inference (#87 #86 #84 #85, ...)
At the risk of delaying fixes for those issues, I wanted to start a discussion about rewriting inference in Python, on top of the current gempyor object structure.
Major benefits:
One-click installation: once Python is installed, setup becomes much easier, and an easy-to-use GUI on Windows, a .exe installer, ... become possible. Currently our team spends a lot of time trying to install flepiMoP or to run it on the server because of the Python/R co-dependence. This massively hinders our distribution.
Simpler architecture and maintenance: currently each basic functionality is written once in Python and again in multiple R files, which makes it very hard to change anything in inference (see the current attempts to fit initial conditions and to fit without NPI, both unfinished, and the failing test). Inference in Python would use the single filesystem layer already defined in gempyor: gempyor objects do not write parquet files themselves, they just hand what they want to write to a function that handles the extension, subpop, and filename in all cases. It would be easier to change our baseline implementation.
Building a better inference architecture: this requires some thought and is not exclusive to a Python rewrite, but we could take inspiration from the architecture of standard inference packages and interact with the current Python ecosystem. We could get diagnostic plots and criteria from ArviZ, and sampler perturbation tuning from emcee, which makes sampling more efficient in high-dimensional spaces. We would also benefit from having parameters as an object, for plotting and for the command line.
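To make the "single filesystem layer" point concrete, here is a minimal sketch of what such a layer could look like. It is not gempyor's actual API: `build_path`, `write_output`, the directory layout, and the `WRITERS` registry are all hypothetical names, and CSV from the standard library stands in for parquet so the example runs without extra dependencies.

```python
import csv
import tempfile
from pathlib import Path


def _write_csv(path, rows):
    """Write a list of dict rows to a CSV file (stand-in for a parquet writer)."""
    with path.open("w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


# Hypothetical registry: one writer per extension, so callers never pick a format.
WRITERS = {"csv": _write_csv}


def build_path(root, run_id, kind, subpop, slot, extension):
    """Single place that decides where a file lives (hypothetical layout)."""
    return Path(root) / run_id / kind / subpop / f"{slot:09d}.{kind}.{extension}"


def write_output(rows, *, root, run_id, kind, subpop, slot, extension="csv"):
    """Model and inference objects hand their data here instead of writing files."""
    if not rows:
        raise ValueError("nothing to write")
    if extension not in WRITERS:
        raise ValueError(f"unsupported extension: {extension}")
    path = build_path(root, run_id, kind, subpop, slot, extension)
    path.parent.mkdir(parents=True, exist_ok=True)
    WRITERS[extension](path, rows)
    return path


# Usage: the caller only describes what it is writing, never the file handling.
demo_root = tempfile.mkdtemp()
written = write_output(
    [{"date": "2020-01-01", "incidH": 3}],
    root=demo_root, run_id="demo", kind="hosp", subpop="all", slot=1,
)
```

With this shape, changing the on-disk layout or adding a format means touching one function, which is the maintenance benefit argued above.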
Minor benefits:
Performance: tight integration with gempyor allows rerunning only some parts of the model, selecting only the outcomes that are needed for the fit, and not writing intermediate simulations to disk (a big overhead on Docker or on scratch4).
Python HPC ecosystem: for cloud or local compute, Python has some great high-performance computing and big-data libraries we could use. This is minor because R has a lot of strong points here as well.
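As an illustration of the performance point, the sketch below keeps every simulation in memory and scores only the outcomes being fitted. Everything here is an assumption made for the example: `simulate` is a toy stand-in for a model run, the outcome names and the Gaussian likelihood are invented, and the sampler is a plain random-walk Metropolis rather than the current inference algorithm.

```python
import math
import random


def simulate(params):
    """Toy stand-in for a model run: returns all outcomes in memory."""
    infections = [100 * math.exp(params["rate"] * t) for t in range(10)]
    return {
        "incidI": infections,
        "incidH": [0.1 * x for x in infections],
        "incidD": [0.01 * x for x in infections],
    }


def log_likelihood(simulated, observed, sigma=5.0):
    """Gaussian log-likelihood up to a constant (assumed error model)."""
    return sum(-0.5 * ((s - o) / sigma) ** 2 for s, o in zip(simulated, observed))


def run_chain(observed, fit_outcomes, n_iter=200, step=0.01, seed=0):
    """Random-walk Metropolis that never touches the disk."""
    rng = random.Random(seed)

    def score(params):
        outcomes = simulate(params)
        # Only the outcomes needed for the fit are compared with data.
        return sum(log_likelihood(outcomes[k], observed[k]) for k in fit_outcomes)

    current = {"rate": 0.05}
    current_ll = score(current)
    chain = []
    for _ in range(n_iter):
        proposal = {"rate": current["rate"] + rng.gauss(0.0, step)}
        proposal_ll = score(proposal)
        # Symmetric proposal, so the acceptance ratio is the likelihood ratio.
        if rng.random() < math.exp(min(0.0, proposal_ll - current_ll)):
            current, current_ll = proposal, proposal_ll
        chain.append(current["rate"])
    return chain


# Usage: fit hospitalizations only; other outcomes are computed but never stored.
observed = {"incidH": simulate({"rate": 0.1})["incidH"]}
chain = run_chain(observed, fit_outcomes=["incidH"])
```

Only the accepted parameter values are kept, so writing to disk becomes an explicit choice (for example, once per saved iteration) rather than a side effect of every simulation.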
Drawbacks:
It's quite some work. I'd estimate 2 weeks for feature parity.
Part of our team is fluent in R, and a rewrite would reduce their ability to change inference (but inference was also changed by non-R folks recently, and we now have more Python devs: Koji, Pengcheng).