Improve performance for large sample sizes

probmods / webppl-viz

Visualization for WebPPL

http://probmods.github.io/webppl-viz/

Improve performance for large sample sizes #30

Open longouyang opened 8 years ago

hawkrobe commented 7 years ago

Any ideas on how to start on this? Calling viz.marginals on a probmods example with >100,000 samples (and four variables in the joint distribution) takes an order of magnitude more time than inference itself.

longouyang commented 7 years ago

There are a couple of bottlenecks:

- The projected-out (marginal) distributions are computed in a concise but inefficient way.
- Density estimation for continuous data is implemented naively. There are various tricks (e.g., FFT-based and tree-based computation) to make this faster, but it might not be worth the effort, given that we'll probably want kernel-based aggregators in core webppl anyway (probmods/webppl#369).

Also, if I had to guess, the main bottleneck is probably the projection, not KDE.
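To make the projection point concrete, here is a minimal sketch of computing every single-variable marginal in one pass over the samples. This is not the webppl-viz implementation: the helper name `marginalsOnePass` is hypothetical, and it assumes each sample is a plain object with equal weight.

```javascript
// Hypothetical helper, not part of webppl-viz: computes every
// single-variable marginal in one pass over the samples.
// Assumes each sample is a plain object such as {a: 1, b: 'x'}
// and that all samples carry equal weight.
function marginalsOnePass(samples) {
  const counts = {}; // variable name -> Map(value -> count)
  for (const sample of samples) {
    for (const name of Object.keys(sample)) {
      if (!counts[name]) counts[name] = new Map();
      const m = counts[name];
      m.set(sample[name], (m.get(sample[name]) || 0) + 1);
    }
  }

  // Normalize counts into probabilities.
  const n = samples.length;
  const marginals = {};
  for (const name of Object.keys(counts)) {
    marginals[name] = new Map();
    for (const [value, count] of counts[name]) {
      marginals[name].set(value, count / n);
    }
  }
  return marginals;
}

const demo = marginalsOnePass([
  {a: 1, b: 'x'},
  {a: 1, b: 'y'},
  {a: 2, b: 'x'},
  {a: 1, b: 'x'}
]);
console.log(demo.a.get(1));   // 0.75
console.log(demo.b.get('y')); // 0.25
```

The cost is proportional to (number of samples) × (number of variables), with no intermediate copies of the joint distribution. Note that `Map` compares keys by identity, so this sketch only suits primitive values.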
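For the density estimation point, one common speedup is to bin the samples onto a grid first and then smooth the bin counts. The sketch below uses direct convolution with a truncated Gaussian kernel, which is a simpler relative of the FFT approach, not the FFT approach itself. The function name `binnedKde` and its parameters are hypothetical, the bandwidth is supplied by the caller, and none of this is taken from webppl-viz.

```javascript
// Hypothetical sketch of a binned Gaussian KDE; not webppl-viz code.
// A naive KDE sums over all N samples at each of G grid points (O(N * G)).
// Here samples are counted into G bins (O(N)) and the bin counts are
// smoothed with a truncated kernel of W weights (O(G * W)).
function binnedKde(xs, bandwidth, gridSize) {
  // Find the data range with a loop; Math.min(...xs) can overflow
  // the call stack for very large arrays.
  let lo = Infinity, hi = -Infinity;
  for (const x of xs) {
    if (x < lo) lo = x;
    if (x > hi) hi = x;
  }
  // Pad the range so the tails are mostly preserved.
  lo -= 3 * bandwidth;
  hi += 3 * bandwidth;
  const step = (hi - lo) / (gridSize - 1);

  // Step 1: count each sample at its nearest grid point.
  const counts = new Float64Array(gridSize);
  for (const x of xs) {
    counts[Math.round((x - lo) / step)] += 1;
  }

  // Step 2: precompute Gaussian kernel weights out to 4 bandwidths.
  const halfWidth = Math.ceil((4 * bandwidth) / step);
  const kernel = new Float64Array(2 * halfWidth + 1);
  const norm = 1 / (xs.length * bandwidth * Math.sqrt(2 * Math.PI));
  for (let j = -halfWidth; j <= halfWidth; j++) {
    const u = (j * step) / bandwidth;
    kernel[j + halfWidth] = norm * Math.exp(-0.5 * u * u);
  }

  // Step 3: spread each non-empty bin's count over its neighbours.
  const density = new Float64Array(gridSize);
  for (let i = 0; i < gridSize; i++) {
    if (counts[i] === 0) continue;
    const start = Math.max(0, i - halfWidth);
    const end = Math.min(gridSize - 1, i + halfWidth);
    for (let k = start; k <= end; k++) {
      density[k] += counts[i] * kernel[k - i + halfWidth];
    }
  }
  return {lo, step, density};
}

// The estimated density should integrate to roughly 1.
const est = binnedKde([0, 1, 2, 3], 0.5, 512);
let total = 0;
for (const d of est.density) total += d * est.step;
console.log(total);
```

The grid size trades accuracy for speed: the binning error shrinks as the grid step becomes small relative to the bandwidth.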

hawkrobe commented 7 years ago

Thanks for the tips -- I might take a look.