tweag / chainsail

Replica Exchange sampling as-a-service
MIT License

Write announcement / beta test invitation texts / tweets #315

Closed simeoncarstens closed 1 year ago

simeoncarstens commented 3 years ago

We would need some well thought-through material to post on

A key point is to bring across nicely why Chainsail is currently closed-source.

Here's a draft for a PyMC3 discourse announcement:

Hi PyMC3 community,

If you have ever struggled with multimodal probability distributions, this post might be interesting for you. As a reminder: a multimodal probability distribution has several modes, that is, regions of high probability. This situation often arises when some parameters in your model are not identifiable: for example, in mixture models, you can switch the labels of the mixture components and get exactly the same probabilities. If you have such a model, the usual advice is to reparameterize it. If that is not possible and you're stuck with a multimodal probability distribution, you often run multiple chains in the hope that each chain discovers and explores a different mode.

Instead of running separate chains that all sample the same distribution, you can also run chains that sample increasingly "flatter" versions ("replicas") of your probability distribution and that communicate with each other by exchanging states. This idea is known as "Parallel Tempering" or "Replica Exchange". If you're interested in the details, here's a shameless plug for a blog post of mine [link]. Replica Exchange is also implemented in TensorFlow Probability (https://www.tensorflow.org/probability/api_docs/python/tfp/mcmc/ReplicaExchangeMC?version=nightly).

Colleagues and I over at Tweag created a web service called "Chainsail" (https://chainsail.io) that provides a self-tuning, easy-to-use Replica Exchange implementation and uses cloud computing resources to scale up to more replicas than would be feasible on a single machine. We're currently looking for probabilistic programming practitioners who might be interested in a private beta test. Chainsail is in an early stage and currently has two major limitations:

  • it doesn't easily interface with PyMC3 or other PPLs (although Stan models kind of work). The probability distribution to sample has to be provided in the form of a Python module with a simple interface, which our code then imports.
  • it uses only a very basic, naive and totally untuned HMC implementation.

That being said, you can absolutely define your model using PyMC3, provided you find a way to expose its log-probability and the gradient of the log-probability. We are no PyMC3 experts, but maybe you'll find a way! If beta tester feedback suggests that Chainsail might be useful for the PyMC3 community and users of other PPLs, a proper interface for consuming PyMC3 models and a better HMC implementation would be among the next things we would work on.

Currently, Chainsail is deployed on Tweag's premises and is available to beta testers for free (within reasonable computing time limits). If you'd like to give Chainsail a try, let us know in this thread or by email (support@chainsail.io) so we can grant you access.

Chainsail is currently closed-source, but depending on community feedback we might open-source at least parts of the service. We have a repository that provides additional information, example probability distributions and other related material: https://github.com/tweag/chainsail-resources

If you have any questions about Replica Exchange or about what the Chainsail service can and cannot do, or if you'd like to test it, please don't hesitate to let me know in this thread or email us: support@chainsail.io
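
To make the "Python module with a simple interface" part of the draft more concrete, here is a minimal sketch of what such a module might look like. The names `log_prob`, `log_prob_gradient`, `pdf` and `initial_states` are assumptions for illustration, derived only from the draft's wording (a log-probability and its gradient); the interface Chainsail actually expects may differ. The example uses only the standard library and a scalar state; a real model would more likely work with NumPy arrays.

```python
import math


class GaussianMixture:
    """Hypothetical example model: equal-weight 1D mixture of Gaussians."""

    def __init__(self, means=(-3.0, 3.0), sigma=1.0):
        self.means = tuple(means)
        self.sigma = sigma

    def _component_logs(self, x):
        # Log of each weighted, normalized component density at x.
        log_norm = math.log(self.sigma * math.sqrt(2.0 * math.pi))
        log_weight = -math.log(len(self.means))
        return [
            log_weight - 0.5 * ((x - m) / self.sigma) ** 2 - log_norm
            for m in self.means
        ]

    def log_prob(self, x):
        # Log-sum-exp over the components, shifted for numerical stability.
        logs = self._component_logs(x)
        top = max(logs)
        return top + math.log(sum(math.exp(v - top) for v in logs))

    def log_prob_gradient(self, x):
        # Responsibility-weighted average of the per-component gradients.
        logs = self._component_logs(x)
        top = max(logs)
        weights = [math.exp(v - top) for v in logs]
        total = sum(weights)
        pull = sum(w / total * (m - x) for w, m in zip(weights, self.means))
        return pull / self.sigma ** 2


# Assumed module-level names that a sampling service could import.
pdf = GaussianMixture()
initial_states = 0.5
```

A finite-difference check such as `(pdf.log_prob(x + 1e-6) - pdf.log_prob(x - 1e-6)) / 2e-6` against `pdf.log_prob_gradient(x)` is a cheap way to catch gradient bugs before handing a model to any HMC-based sampler.
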
simeoncarstens commented 2 years ago

Next iteration :slightly_smiling_face:

Chainsail, a web service for sampling multimodal distributions: opinions and beta testers wanted!

Hi PyMC3 community,

Colleagues and I over at Tweag created a web service called Chainsail (https://chainsail.io) that can drastically improve sampling of multimodal distributions, which occur often in models with unidentifiable parameters or when you have ambiguous data. Chainsail has flexible support for models and probability distributions defined in PyMC, Stan, or hand-written Python. The secret sauce in Chainsail is an autotuning Replica Exchange algorithm that uses cloud computing to scale dynamically beyond the computing resources available on single machines. If you want to learn more about Replica Exchange, here's a shameless plug: I wrote a blog post about it.
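
For readers who have not seen Replica Exchange before, the idea can be sketched in a few lines of plain Python. This is a toy illustration only, not Chainsail's implementation: it uses random-walk Metropolis instead of HMC, a hand-picked inverse temperature ladder `BETAS` instead of an autotuned one, and a made-up bimodal target. All names here are hypothetical.

```python
import math
import random

# Assumed, hand-picked inverse temperatures; beta = 1 is the original target.
BETAS = [1.0, 0.3, 0.1, 0.03, 0.01]


def log_prob(x):
    # Unnormalized bimodal target: narrow Gaussians centered at -4 and +4.
    a = -0.5 * ((x + 4.0) / 0.5) ** 2
    b = -0.5 * ((x - 4.0) / 0.5) ** 2
    top = max(a, b)
    return top + math.log(math.exp(a - top) + math.exp(b - top))


def metropolis_step(x, beta, rng):
    # Random-walk Metropolis on the tempered target p(x) ** beta.
    step = 0.5 / math.sqrt(beta)  # wider proposals for flatter replicas
    proposal = x + rng.gauss(0.0, step)
    log_accept = beta * (log_prob(proposal) - log_prob(x))
    if rng.random() < math.exp(min(0.0, log_accept)):
        return proposal
    return x


def replica_exchange(betas, n_steps, swap_every=5, seed=0):
    rng = random.Random(seed)
    states = [-4.0] * len(betas)  # every replica starts in the left mode
    samples = []
    for step in range(n_steps):
        states = [metropolis_step(x, b, rng) for x, b in zip(states, betas)]
        if (step + 1) % swap_every == 0:
            # Attempt to exchange states between neighboring replicas.
            for i in range(len(betas) - 1):
                log_accept = (betas[i] - betas[i + 1]) * (
                    log_prob(states[i + 1]) - log_prob(states[i])
                )
                if rng.random() < math.exp(min(0.0, log_accept)):
                    states[i], states[i + 1] = states[i + 1], states[i]
        samples.append(states[0])  # keep only the beta = 1 replica
    return samples


def fraction_right(samples):
    # Share of samples that landed in the mode at +4.
    return sum(x > 0.0 for x in samples) / len(samples)
```

Comparing `fraction_right(replica_exchange([1.0], 20000))` with `fraction_right(replica_exchange(BETAS, 20000))` shows the effect: a single chain started in the left mode has no realistic way across the barrier, while the tempered ladder lets the `beta = 1` replica visit both modes.
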

We're currently looking for probabilistic programming practitioners who might be interested in beta-testing it. If you have multimodal distributions to sample, shoot us an email at support@chainsail.io to get your email address authorized! Don't hesitate either to tell us a bit about your sampling problem or ask us for a demo - we would be happy to chat and show you around. Currently, Chainsail is deployed on Tweag's premises and is available to beta testers for free (within reasonable computing time limits).

Future Chainsail development depends mostly on beta tester feedback, but faster Stan support, a better HMC implementation (possibly using BlackJAX) and more choices for the tempering schemes (for example, applying a temperature only to the likelihood) would be among the next things to work on. Chainsail is currently closed-source, but it is highly likely that we will eventually make at least parts of the service, if not all of it, open-source. If you'd like to learn more about Chainsail, here's a couple of additional resources:

Chainsail is in an early stage and currently has a couple of major limitations:

If you have any questions about Replica Exchange or about what the Chainsail service can and cannot do, or if you'd like to test it, please don't hesitate to let me know in this thread or email us: support@chainsail.io. Looking forward to hearing your questions, opinions and ideas!
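
The draft mentions "applying a temperature only to the likelihood" as a possible tempering scheme. A minimal sketch of the difference between that and tempering the whole posterior, under an assumed toy model (standard normal prior, Gaussian likelihood whose mean is `theta ** 2`, so the sign of `theta` is not identifiable); the function names are hypothetical and not part of Chainsail:

```python
import math


def log_prior(theta):
    # Standard normal prior on theta.
    return -0.5 * theta ** 2 - 0.5 * math.log(2.0 * math.pi)


def log_likelihood(theta, data, sigma=1.0):
    # Observations scatter around theta ** 2, so +theta and -theta fit equally.
    log_norm = math.log(sigma * math.sqrt(2.0 * math.pi))
    return sum(-0.5 * ((y - theta ** 2) / sigma) ** 2 - log_norm for y in data)


def fully_tempered(theta, data, beta):
    # Temperature applied to the whole posterior: flat as beta goes to 0.
    return beta * (log_prior(theta) + log_likelihood(theta, data))


def likelihood_tempered(theta, data, beta):
    # Temperature applied to the likelihood only: the prior as beta goes to 0.
    return log_prior(theta) + beta * log_likelihood(theta, data)
```

Both schemes coincide at `beta = 1`. They differ at the hot end of the ladder: full tempering approaches a flat, possibly improper distribution, whereas likelihood tempering falls back to the prior, which is usually easy to sample.
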

simeoncarstens commented 1 year ago

Done with the August 2022 release.