Open sgraf812 opened 2 years ago
Having pure state was the main goal and achievement of the interpreter. This will not be changed for sure because it would ruin readability and simplicity. Haskell simply needs a better compiler. IMO it is a seriously bad habit to make Haskell programs more imperative to gain performance. Instead improve the compiler.
Use staged compilation to make it faster. https://github.com/AndrasKovacs/staged
Please customize the interpreter for your needs. The idea is that one can easily specialize and refactor the interpreter to do experiments without worrying about code quality, focusing instead on the creative and research-specific parts.
> Having pure state was the main goal and achievement of the interpreter
Yes, and I agree that's a big deal. From what I heard, implementing our instrumentation ideas on top of your work was quite a breeze.
> Instead improve the compiler.
A static analysis that reuses heap cells like that is non-trivial. I also live in the here and now, and at the moment we don't have such an analysis.
> Use staged compilation to make it faster.

I agree that might be a valuable path to explore, but it is not much of a short-term solution. It is also unclear to me whether it would even optimise away all the `StgState` overhead.
What do you think about my second suggestion?
> Segregate `StgState` into two (or more) records `StgStateHot`/`StgStateCold`. Put hot stuff like `ssCurrentProgramPoint` in `StgStateHot`. Bonus points for a record pattern synonym that keeps the old interface (but then call sites must be absolutely sure to inline away the PS)

I think that will go a long way towards less copying of large `StgState`s, and it won't impact customisability of the interpreter at all.
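To make the hot/cold proposal concrete, here is a minimal sketch, not the interpreter's actual code. Apart from `StgState`, `StgStateHot`, `StgStateCold`, `ssCurrentProgramPoint` and `setProgramPoint`, every name, field and type below is a hypothetical stand-in. The record pattern synonym `StgState` is meant to preserve the old flat field names for reads, construction and record updates.

```haskell
{-# LANGUAGE PatternSynonyms #-}

import qualified Data.Map.Strict as Map

-- Hypothetical stand-ins for the interpreter's real types.
type ProgramPoint = Int
type Heap = Map.Map Int String

-- Small record holding the fields that change on almost every step.
data StgStateHot = StgStateHot
  { hotCurrentProgramPoint :: !ProgramPoint
  , hotStepCounter         :: !Int
  }

-- Larger record holding the fields that change rarely.
data StgStateCold = StgStateCold
  { coldHeap    :: !Heap
  , coldOptions :: ![String]
  }

-- The real constructor is renamed so that the pattern synonym can take
-- over the old constructor name.
data StgState = MkStgState
  { ssHot  :: !StgStateHot
  , ssCold :: !StgStateCold
  }

-- Record pattern synonym exposing the old flat interface.
pattern StgState :: ProgramPoint -> Int -> Heap -> [String] -> StgState
pattern StgState { ssCurrentProgramPoint, ssStepCounter, ssHeap, ssOptions } =
  MkStgState
    (StgStateHot ssCurrentProgramPoint ssStepCounter)
    (StgStateCold ssHeap ssOptions)
{-# COMPLETE StgState #-}

-- Hot-path setter: allocates a new StgStateHot and a new two-field
-- MkStgState wrapper, and shares the existing cold record untouched.
setProgramPoint :: ProgramPoint -> StgState -> StgState
setProgramPoint pp s = s { ssHot = (ssHot s) { hotCurrentProgramPoint = pp } }
```

A record update written through the synonym, such as `s { ssCurrentProgramPoint = pp }`, matches and rebuilds both sub-records unless GHC inlines and simplifies the synonym, which is the caveat about inlining mentioned in the quoted suggestion. Hot-path code can sidestep this by updating `ssHot` directly, as `setProgramPoint` does here.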
I do not plan to optimize the interpreter further. To me, the interpreter should be a high-level specification, which it is right now, and it should not carry any optimization-related noise. I stick to this idea because I plan to run experiments where I literally use the interpreter as a specification and generate code from it (i.e. a free-monad-based interpreter), so I need to keep the code simple. BTW, you can optimize it for your own research if you wish: just fork it. Do not look at the interpreter as a software product, and do not hesitate to make ad-hoc modifications to it; it's cheap. So please implement your optimization ideas yourself in your fork.
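As a rough illustration of what "interpreter as specification" could look like with a free monad, here is a self-contained sketch. It is an assumption about the general technique, not the planned design: `InterpF`, `setPP`, `getPP`, `runPure` and `runIO` are hypothetical names, and the `Free` type is defined locally rather than taken from a library. The program `bump` is written once against the instruction set and can then be run by a pure handler or by a mutable one.

```haskell
{-# LANGUAGE DeriveFunctor #-}

import Data.IORef (IORef, readIORef, writeIORef)

-- A minimal free monad, defined locally to keep the sketch dependency-free.
data Free f a = Pure a | Free (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a) = Pure (g a)
  fmap g (Free m) = Free (fmap (fmap g) m)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure g <*> x = fmap g x
  Free m <*> x = Free (fmap (<*> x) m)

instance Functor f => Monad (Free f) where
  Pure a >>= k = k a
  Free m >>= k = Free (fmap (>>= k) m)

-- Hypothetical instruction set: the "specification" only names operations.
data InterpF next
  = SetProgramPoint Int next
  | GetProgramPoint (Int -> next)
  deriving Functor

setPP :: Int -> Free InterpF ()
setPP pp = Free (SetProgramPoint pp (Pure ()))

getPP :: Free InterpF Int
getPP = Free (GetProgramPoint Pure)

-- A tiny program written against the specification only.
bump :: Free InterpF Int
bump = do
  pp <- getPP
  setPP (pp + 1)
  getPP

-- Handler 1: pure state threading, in the spirit of an immutable state record.
runPure :: Free InterpF a -> Int -> (a, Int)
runPure (Pure a) s = (a, s)
runPure (Free (SetProgramPoint pp k)) _ = runPure k pp
runPure (Free (GetProgramPoint k)) s = runPure (k s) s

-- Handler 2: the same program executed against a mutable cell.
runIO :: IORef Int -> Free InterpF a -> IO a
runIO _ (Pure a) = pure a
runIO ref (Free (SetProgramPoint pp k)) = writeIORef ref pp >> runIO ref k
runIO ref (Free (GetProgramPoint k)) = readIORef ref >>= runIO ref . k
```

In this shape the choice between pure and mutable state lives in the handler, so the program text itself stays free of optimization concerns.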
> A static analysis that reuses heap cells like that is non-trivial. I also live in the here and now, and at the moment we don't have such an analysis.

One of the GRIN Compiler's goals is to experiment with such analyses and make them real.
Here's a profile of a simplified benchmark case of NoFib's `bernoulli` after #8 has been fixed:

Most of the functions there are related to stack or heap manipulation. Looking at the code and the fact that `setProgramPoint` (which does only one thing: modify the `StgState`'s `ssCurrentProgramPoint`) contributes almost 10% of all allocations, I think the lovely simple design of a single `StgState` which contains the whole interpreter state in a huge immutable record might be the next bottleneck.

Unfortunately, we don't have mutable fields (yet) in GHC Haskell. So here are other suggestions:

- […] `StgState` `STVar`s or `MVar`s. Probably the most performant option
- Segregate `StgState` into two (or more) records `StgStateHot`/`StgStateCold`. Put hot stuff like `ssCurrentProgramPoint` in `StgStateHot`. Bonus points for a record pattern synonym that keeps the old interface (but then call sites must be absolutely sure to inline away the PS)
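The mutable-reference suggestion could look roughly like the sketch below. It uses `IORef` from `base` as the mutable cell purely for illustration; an `STRef` or `MVar` variant would have the same shape. `MutState`, its fields, `newMutState`, `setProgramPointM` and `stepM` are hypothetical names, not part of the interpreter.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)

-- Hypothetical stand-in for the interpreter's real program point type.
type ProgramPoint = Int

-- The record itself stays immutable; only the cells it points to change,
-- so updating one field never copies the whole record.
data MutState = MutState
  { msCurrentProgramPoint :: !(IORef ProgramPoint)
  , msStepCounter         :: !(IORef Int)
  }

newMutState :: IO MutState
newMutState = MutState <$> newIORef 0 <*> newIORef 0

-- Overwrites a single cell in place.
setProgramPointM :: MutState -> ProgramPoint -> IO ()
setProgramPointM st pp = writeIORef (msCurrentProgramPoint st) pp

-- One interpreter "step": record where we are and count the step.
stepM :: MutState -> ProgramPoint -> IO ()
stepM st pp = do
  setProgramPointM st pp
  modifyIORef' (msStepCounter st) (+ 1)

-- Reading a snapshot now requires IO as well.
currentProgramPoint :: MutState -> IO ProgramPoint
currentProgramPoint = readIORef . msCurrentProgramPoint
```

The trade-off is that every state access moves into `IO` (or `ST`), and taking a snapshot of the whole state is no longer free, which gives up the pure-state property of the current design.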