Open axch opened 8 years ago
Some data on how helpful this is likely to be. I profiled the command
venture puma -e
"assume x = normal(0, 1);
observe normal(x, 1) = 2;
infer mh(default, one, 300000);"
and, if I am eyeballing the results correctly
Based on this, I estimate a 2x improvement on this workload, and corresponding amplification of the value obtained from fixing other hot spots.
From conversation with @riastradh-probcomp:
T foo
in each function signature, change the signature to const T & foo_ref
, and add to the implementation T foo = foo_ref;
. This conserves the semantics of the program, since it just makes the copy explicit; but now, eliminating those copies is a local operation within the body of each individual function.Partial work to do this is on branch 20160527-riastradh-pumacopy.
Problem: C++ makes it difficult to usefully change the signature of a virtual member function with a default implementation in an abstract base classes, because the actual name of the function includes its type, so changing the type merely renames it, and any derived classes defining member functions with the same nominal name just stop overriding it.
For example, the PSP
class has a virtual member function double logDensity(VentureValuePtr value, shared_ptr<Args> args)
with a default implementation returning zero. Many derived classes of PSP
override this logDensity
function. If we change it to double logDensity(const VentureValuePtr &, const shared_ptr <Args> & args)
, then the derived classes no longer override it -- which would be hunky-dory if the compiler reported this as an error, but instead callers using the virtual member function of PSP
now get the default implementation instead of the intended overridden one.
In contrast, the virtual member function VentureValuePtr simulate(shared_ptr<Args> args, gsl_rng * rng)
of PSP
has no default implementation, and is marked purely virtual with =0
, so any derived class failing to override it is reported immediately by the compiler -- which makes it much easier to change the signature to VentureValuePtr simulate(const shared_ptr<Args> & args, gsl_rng * rng)
and reliably update all the derived classes trying to override it.
(C++11 allows a member function of a derived class to be tagged override
, so that the compiler rejects it if it doesn't actually override a member function of a base class. But we're not using C++11.)
The obvoius next step would be to split the default implementations of all these methods into separate base classes that can be used in combination with PSP
to avoid default implementations in PSP
itself. The same goes for various other abstract base classes: VentureValue
, DB
, Trace
, &c.
For the particular case of PSP
, Lite has a better thought-out class hierarchy, with a handful of base classes with appropriate names that supply appropriate default implementations.
Is it possible to instead grep for all those override definitions and change them in tandem, one method of the hierarchy at a time?
I'm not confident in the outcome of grep -- I'm much more confident in getting the compiler to double-check my work at compile-time.
I looked at lite a little bit and found it not totally obvious how the base classes are organized and what behaviour the various options suppress -- I'll need some help to confidently identify the correct semantic partitioning.
For example, puma's PSP::canAbsorb has a default method that returns false, which presumably suppresses logDensity, logDensityOfData, incorporate, and unincorporate, which in turn sounds like what we call a 'likelihood-free (P)SP'. However, lite's PSP.canAbsorb is not implemented, rather than default false, and lite's LikelihoodFreePSP overrides only canAbsorb (to return false), not logDensity, logDensityOfData, incorporate, or unincorporate.
Here's the partitioning I think is intended, along with some names for what the base classes that could supply default virtual methods are called:
canAbsorb
--
If false, called likelihood-free. If true, called ???.
childrenCanAAA
--
If false, called ???. If true, called deterministic-maker-aaaugh?
Not quite.
incorporate
is a generator of an implicit Abelian group that acts on the auxiliary state, and calling unincorporate
is the inverse). If these are nontrivial, the SP is called "[exchangeably] coupled" but, in the current system, they are always called.The interim change I proposed requires each PSP to explicitly say 'no, I'm not a continuous PSP' or 'no, I don't have a delta kernel'. An alternative would be to use dynamic_cast and run-time type information (RTTI) with different base classes for each additional set of functionality a PSP might have. Something like this:
class PSP {
public:
virtual VentureValuePtr simulate(...) const =0;
};
class AssessablePSP: public virtual PSP {
public:
virtual double logDensity(...) const =0;
virtual double logDensityOfData(...) const =0;
};
class MyPSP: public PSP, public AssessablePSP {
public:
VentureValuePtr simulate(...) { ... }
double logDensity(...) { ... }
double logDensityOfData(...) { ... }
};
Usage:
PSP * x = ...;
AssessablePSP * x_assess;
if ((x_assess = dynamic_cast<AssessablePSP *>(x))) {
.... x_assess->logDensity(...) ...
} else {
// non-assessable, fall back to whatever else
}
This still requires organizing the extra sets of functionality into additional base classes, but it does not require changing every PSP to list yes-or-no for every extra set of functionality it might have.
Looks reasonable.
Complication with the AssessablePSP proposal above: the decision of whether the PSP is assessable depends not on the PSP itself but also on some extra parameters to canAbsorb.
We could instead do something like this:
struct PSP {
virtual Assessor * canAbsorb(trace, appNode, node);
...
};
PSP * x = ...;
Assessor * a = x->canAbsorb(trace, appNode, node);
if (a) {
... a->logDensity(...) ...
} else {
// non-assessable, fall back to whatever else
}
(This API is also safer in that you can't accidentally call logDensity if you didn't already call canAbsorb.)
I worry about this complicating the PSP-implementor-facing interface. Perhaps that can be alleviated by the convention of returning this
from canAbsorb
and having the PSP itself implement the Assessor
interface in the assessable case.
Yes, that will certainly work -- that's even what I had in mind.
Bleh. Not every call to logDensity is immediately preceded by canAbsorb. E.g.,
As Taylor noted in code review yesterday, Puma is full of accidental use of C++ copy constructors in places where passing objects (including asymptotically large objects like vectors and sets) by constant reference would do. Going through and changing all such places to be constant references should both reduce incidence of bugs involving unexpected (non-)mutation, and improve performance. No measurements have yet been done, but anecdotally the performance gains could be quite substantial.
Marking blocked on #429 because if I were to do this I would want to do a style cleanup first (80-character line limit!).