quantified-uncertainty / squiggle

An estimation language
https://squiggle-language.com
MIT License
148 stars 22 forks source link

Value serialization #3158

Closed berekuk closed 2 months ago

berekuk commented 3 months ago

Fixes #2081 Fixes #1198

Not working in browser yet, but CLI is functional, for the value types that I've implemented so far, including lambdas.

This response was generated in a worker thread (ProjectItem passes AST+externals to it), serialized by the worker, and deserialized back by the main thread:

$ pnpm --silent run cli run -e 'a=5; b={|x|x+a}; c = 1 to 2'
{a: 5, b: (x) => internal code, c: Sample Set Distribution}

TODO:

changeset-bot[bot] commented 3 months ago

⚠️ No Changeset found

Latest commit: 6f23f20e080c740faa3cdacc47eaab01f2fb8871

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

vercel[bot] commented 3 months ago

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated (UTC)
quri-hub ✅ Ready (Inspect) Visit Preview May 2, 2024 8:57pm
quri-ui ✅ Ready (Inspect) Visit Preview May 2, 2024 8:57pm
squiggle-components ✅ Ready (Inspect) Visit Preview May 2, 2024 8:57pm
squiggle-website ✅ Ready (Inspect) Visit Preview May 2, 2024 8:57pm
berekuk commented 3 months ago

This is now semi-functional in components storybook, with these caveats:

Still, it's good enough to try and confirm the expected benefits: editing a slow code with autoruns enabled doesn't freeze the editor 🎉

berekuk commented 2 months ago

On performance: serialization of JSON from the worker back to the main thread is slower than I expected.

As an example, in CLI, List.upTo(1,3000) -> map({|i|2 to 3}) (3000 samplesets of 1000 samples) takes 0.7 seconds (on M1 Pro). Around 0.2s is for getStdlib(), but the remaning 0.5 seconds is still higher than 0.18s it takes on the old version.

I don't know yet if it'll be similar in the browser, Node's worker_threads are implemented differently from webworkers.

This is not a blocker, because this example is pathological (we'd have trouble rendering 3000 samplesets in React, anyway), and because unblocked main thread is more valuable, so user experience will be better, but if timings in the browser are similar, then it will cause more CPU overhead.

berekuk commented 2 months ago

Progress since my last comment:

Things to do:

berekuk commented 2 months ago

I'll do web workers in a separate PR, since value serialization is already useful as-is for other purposes (e.g. server-side caching).

I've implemented EmbeddedWithSerializationRunner which does the same thing as EmbeddedRunner. EmbeddedRunner does the same thing as we did before — runs Squiggle in the main thread.

There's now a Select field in playground settings that allows to enable it, to test that it's fine:

Screenshot 2024-04-18 at 14 58 14

embedded is still default, because it's more performant; there's no benefit in using embedded-with-serialization, except for verifying that serialization works correctly.


My current code for testing that embedded-with-serialization is working is:

List.upTo(1, 20) -> map({|i| List.upTo(1,10000)})

With embedded runner, it stays around ~10ms; with embedded-with-serialization it's ~50ms.


Another way to play with runners is through CLI:

$ pnpm --silent cli run -e '2+2'
4
$ pnpm --silent cli run -r embedded-with-serialization -e '2+2'
4
$ pnpm --silent cli run -r node-worker -e '2+2'
4
berekuk commented 2 months ago

I think this is ready for review.

There aren't that many tests, but non-default runners can be tested with SQUIGGLE_DEFAULT_RUNNER=... pnpm test, and they all pass.

(it'd be valuable to run all tests with different runners by default, but running everything 3 times when only SqProject part is affected is a bit too much, so I'm not sure how to do that)

I'll mark the most important parts of code in PR comments.

berekuk commented 2 months ago

Huh, the builds fail, I'll investigate.