Value serialization - Githubissues

berekuk commented 3 months ago

Fixes #2081 Fixes #1198

Not working in the browser yet, but the CLI is functional for the value types that I've implemented so far, including lambdas.

This response was generated in a worker thread (ProjectItem passes AST+externals to it), serialized by the worker, and deserialized back by the main thread:

$ pnpm --silent run cli run -e 'a=5; b={|x|x+a}; c = 1 to 2'
{a: 5, b: (x) => internal code, c: Sample Set Distribution}

TODO:

[x] support all remaining value types and all dist subtypes (SampleSetDists are supported, but Symbolic and pointsets aren't)
[x] better serialization for lambdas and expressions (expressions contain values, this forced me to write more code than necessary, and AST is still heavily duplicated)
[x] check on TypeScript level that all serialized types are serializable (ideally, JSON-compatible)
[x] avoid vLambda calls in serialization — they're bad for deduplication
[x] error handling in web workers
[x] measure performance — spinning up a new worker for each run might be too expensive
[x] integrate with Next.js and browsers
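
A minimal sketch of what the "check on TypeScript level" item could look like, assuming serialized values are plain object types. `JsonValue`, `SerializedValue`, `AssertJson` and `roundTrip` are hypothetical names for illustration, not the types used in squiggle-lang:

```typescript
// Hypothetical names throughout; this is not the squiggle-lang code.
type JsonValue =
  | null
  | boolean
  | number
  | string
  | JsonValue[]
  | { [key: string]: JsonValue };

// Hypothetical serialized shapes for two value kinds.
type SerializedNumber = { kind: "Number"; value: number };
type SerializedArray = { kind: "Array"; items: number[] }; // ids into a shared store

type SerializedValue = SerializedNumber | SerializedArray;

// Compiles only if T is assignable to JsonValue.
type AssertJson<T extends JsonValue> = T;
type _Check = AssertJson<SerializedValue>;

// A shape like this would fail to compile, because functions are not JSON:
// type _Bad = AssertJson<{ kind: "Lambda"; fn: () => void }>;

// Runtime counterpart: anything typed as JsonValue survives a JSON round trip.
function roundTrip<T extends JsonValue>(value: T): T {
  return JSON.parse(JSON.stringify(value)) as T;
}

const example: SerializedValue = { kind: "Array", items: [0, 1] };
const copy = roundTrip(example);
```

The check relies on the serialized shapes being `type` aliases; `interface` declarations don't get an implicit index signature and would not be assignable to `JsonValue` this way.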

changeset-bot[bot] commented 3 months ago

⚠️ No Changeset found

Latest commit: 6f23f20e080c740faa3cdacc47eaab01f2fb8871

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

vercel[bot] commented 3 months ago

The latest updates on your projects.

Name	Status	Preview	Updated (UTC)
quri-hub	✅ Ready (Inspect)	Visit Preview	May 2, 2024 8:57pm
quri-ui	✅ Ready (Inspect)	Visit Preview	May 2, 2024 8:57pm
squiggle-components	✅ Ready (Inspect)	Visit Preview	May 2, 2024 8:57pm
squiggle-website	✅ Ready (Inspect)	Visit Preview	May 2, 2024 8:57pm

berekuk commented 3 months ago

This is now semi-functional in components storybook, with these caveats:

- only in dev mode; the prod build didn't bundle the worker properly
- value serialization now works for almost all values (except Plots; todo)
- exceptions are forwarded, but only as strings, so no stack traces
- web workers are created on each simulation, so there's a ~250ms overhead per simulation on my MBP M1 Pro
- there's no debouncing, and responses can arrive in the wrong order
- there's a weird bug where typing something simple like x=1 causes an error at first
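
For the out-of-order responses, one common approach is to tag each run with an increasing id and ignore any response that isn't for the latest run. A minimal sketch under that assumption; `LatestOnly` is a hypothetical helper, not something from this PR:

```typescript
// Hypothetical helper: keeps only the result of the most recently started run.
class LatestOnly<T> {
  private latestId = 0;
  private current: T | undefined;

  // Call when a run is started; returns the id to attach to the request.
  start(): number {
    return ++this.latestId;
  }

  // Call when a response arrives; stale responses are ignored.
  accept(id: number, result: T): boolean {
    if (id !== this.latestId) return false;
    this.current = result;
    return true;
  }

  get value(): T | undefined {
    return this.current;
  }
}

const guard = new LatestOnly<string>();
const first = guard.start(); // user typed "x=1"
const second = guard.start(); // user typed "x=12"

// Responses arrive out of order: the newer one first, then the stale one.
const acceptedSecond = guard.accept(second, "result for x=12");
const acceptedFirst = guard.accept(first, "result for x=1");
```

This doesn't replace debouncing, which would reduce the number of workers started in the first place; it only keeps a late response from overwriting a newer one.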

Still, it's good enough to try and confirm the expected benefits: editing slow code with autorun enabled doesn't freeze the editor 🎉

berekuk commented 2 months ago

On performance: serialization of JSON from the worker back to the main thread is slower than I expected.

As an example, in the CLI, List.upTo(1,3000) -> map({|i|2 to 3}) (3000 samplesets of 1000 samples) takes 0.7 seconds (on M1 Pro). Around 0.2s is for getStdlib(), but the remaining 0.5 seconds is still higher than the 0.18s it takes on the old version.

I don't know yet if it'll be similar in the browser; Node's worker_threads are implemented differently from web workers.

This is not a blocker: the example is pathological (we'd have trouble rendering 3000 samplesets in React anyway), and an unblocked main thread is more valuable, so the user experience will still be better. But if timings in the browser are similar, it will cause more CPU overhead.
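
To get a rough feel for the JSON part of that cost in isolation, here is a small sketch that times a stringify/parse round trip for 3000 arrays of 1000 numbers. It uses plain number arrays as a stand-in for sample sets and measures neither the worker message passing nor the real serializer, so the number it prints is only a lower bound for this kind of payload:

```typescript
// 3000 arrays of 1000 numbers, standing in for 3000 sample sets.
const data: number[][] = Array.from({ length: 3000 }, () =>
  Array.from({ length: 1000 }, () => Math.random())
);

const start = Date.now();
const encoded = JSON.stringify(data);
const decoded = JSON.parse(encoded) as number[][];
const elapsedMs = Date.now() - start;

console.log(`JSON round trip: ${elapsedMs} ms for ${encoded.length} characters`);
```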

berekuk commented 2 months ago

Progress since my last comment:

- runners API (with EmbeddedRunner and NodeWorkerRunner)
- errors and the remaining value types are serializable now
- serialization supports multiple types, so we can serialize and deduplicate expressions too; not AST yet, but it's straightforward to add
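
To illustrate what "multiple types with deduplication" can mean, here is a minimal sketch of a serializer that keeps one array per kind and refers to entries by index, so a node shared by several parents is stored once. The `Expr` type, the `Bundle` layout and the `Serializer` class are all hypothetical and much simpler than the real ones:

```typescript
// Hypothetical toy expression type; values are just numbers here.
type Expr =
  | { kind: "Value"; value: number }
  | { kind: "Add"; left: Expr; right: Expr };

type SerializedExpr =
  | { kind: "Value"; valueId: number }
  | { kind: "Add"; leftId: number; rightId: number };

// Hypothetical bundle layout: one array of entries per kind, referenced by index.
type Bundle = { value: number[]; expression: SerializedExpr[] };

class Serializer {
  readonly bundle: Bundle = { value: [], expression: [] };
  private seenValues = new Map<number, number>();
  private seenExprs = new Map<Expr, number>(); // deduplicates by object identity

  value(v: number): number {
    let id = this.seenValues.get(v);
    if (id === undefined) {
      id = this.bundle.value.push(v) - 1;
      this.seenValues.set(v, id);
    }
    return id;
  }

  expression(e: Expr): number {
    const seen = this.seenExprs.get(e);
    if (seen !== undefined) return seen;
    const serialized: SerializedExpr =
      e.kind === "Value"
        ? { kind: "Value", valueId: this.value(e.value) }
        : { kind: "Add", leftId: this.expression(e.left), rightId: this.expression(e.right) };
    const id = this.bundle.expression.push(serialized) - 1;
    this.seenExprs.set(e, id);
    return id;
  }
}

function deserialize(bundle: Bundle, id: number, cache = new Map<number, Expr>()): Expr {
  const cached = cache.get(id);
  if (cached) return cached;
  const entry = bundle.expression[id];
  const e: Expr =
    entry.kind === "Value"
      ? { kind: "Value", value: bundle.value[entry.valueId] }
      : {
          kind: "Add",
          left: deserialize(bundle, entry.leftId, cache),
          right: deserialize(bundle, entry.rightId, cache),
        };
  cache.set(id, e);
  return e;
}

// The same sub-expression is used twice but stored once.
const shared: Expr = { kind: "Value", value: 5 };
const root: Expr = { kind: "Add", left: shared, right: shared };

const serializer = new Serializer();
const rootId = serializer.expression(root);
const restored = deserialize(JSON.parse(JSON.stringify(serializer.bundle)) as Bundle, rootId);
```

Adding another kind (AST nodes, for example) would mean one more array in the bundle and one more method on the serializer.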

Things to do:

- bundle the worker in squiggle-lang (probably with esbuild)
  - this should fix existing tests; write more tests
- optional: implement WebWorkerRunner, remove web-worker dependency
- use the worker in Next.js
- run a long-lived worker (currently I spin up a new one every time, so it has to call getStdlib() every time, and that's expensive)
- make runners configurable in SqProject; possibly add a toggle in the playground to choose a runner

berekuk commented 2 months ago

I'll do web workers in a separate PR, since value serialization is already useful as-is for other purposes (e.g. server-side caching).

I've implemented EmbeddedWithSerializationRunner, which does the same thing as EmbeddedRunner but passes the result through serialization and deserialization. EmbeddedRunner does what we did before: it runs Squiggle in the main thread.

There's now a Select field in playground settings that lets you enable it, to test that it's fine:

embedded is still the default, because it's more performant; there's no benefit in using embedded-with-serialization, except for verifying that serialization works correctly.
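
The relationship between the two runners can be sketched roughly like this. Only the class names come from this PR; the `Runner` interface, the `RunResult` shape and the toy `evaluate` function are assumptions made for the sketch, and the real runner API in squiggle-lang may look different:

```typescript
// Hypothetical shapes: the real runner API in squiggle-lang may look different.
type RunResult = { ok: true; value: number } | { ok: false; error: string };

interface Runner {
  run(code: string): Promise<RunResult>;
}

// Toy stand-in for the interpreter: sums "a+b+c" style input.
function evaluate(code: string): RunResult {
  const parts = code.split("+").map((part) => Number(part.trim()));
  return parts.some((n) => Number.isNaN(n))
    ? { ok: false, error: `cannot evaluate: ${code}` }
    : { ok: true, value: parts.reduce((a, b) => a + b, 0) };
}

// Runs in the calling thread and returns the result as-is.
class EmbeddedRunner implements Runner {
  async run(code: string): Promise<RunResult> {
    return evaluate(code);
  }
}

// Same evaluation, but the result takes a serialize/deserialize round trip,
// which exercises the same path a worker-based runner would need.
class EmbeddedWithSerializationRunner implements Runner {
  async run(code: string): Promise<RunResult> {
    const serialized = JSON.stringify(evaluate(code));
    return JSON.parse(serialized) as RunResult;
  }
}
```

With this shape, a worker-based runner would differ only in where `evaluate` runs and how the serialized string travels back.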

My current code for testing that embedded-with-serialization is working is:

List.upTo(1, 20) -> map({|i| List.upTo(1,10000)})

- the outer List is there to collapse 20 nodes and avoid rendering too many React components
- the inner List is there to give the serializer enough work (you can also try it with other data types, e.g. lambdas, or produce errors)

With the embedded runner, it stays at ~10ms; with embedded-with-serialization it's ~50ms.

Another way to play with runners is through CLI:

$ pnpm --silent cli run -e '2+2'
4
$ pnpm --silent cli run -r embedded-with-serialization -e '2+2'
4
$ pnpm --silent cli run -r node-worker -e '2+2'
4

berekuk commented 2 months ago

I think this is ready for review.

There aren't that many tests, but non-default runners can be tested with SQUIGGLE_DEFAULT_RUNNER=... pnpm test, and they all pass.

(it'd be valuable to run all tests with different runners by default, but running everything 3 times when only SqProject part is affected is a bit too much, so I'm not sure how to do that)
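
For reference, picking a runner from that environment variable could look something like the sketch below. The runner names are the ones used in the CLI examples above; `resolveRunnerName` is a hypothetical helper and not necessarily how squiggle-lang reads the variable:

```typescript
// Runner names as they appear in the CLI examples.
const runnerNames = ["embedded", "embedded-with-serialization", "node-worker"] as const;
type RunnerName = (typeof runnerNames)[number];

// Hypothetical helper: falls back to "embedded" when the variable is unset.
function resolveRunnerName(env: Record<string, string | undefined>): RunnerName {
  const raw = env["SQUIGGLE_DEFAULT_RUNNER"];
  if (raw === undefined || raw === "") return "embedded";
  const match = runnerNames.find((name) => name === raw);
  if (match === undefined) {
    throw new Error(`Unknown runner "${raw}", expected one of: ${runnerNames.join(", ")}`);
  }
  return match;
}

// In Node this would be called as resolveRunnerName(process.env).
```

Failing loudly on an unknown name avoids a test run that silently falls back to the default runner.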

I'll mark the most important parts of code in PR comments.

berekuk commented 2 months ago

Huh, the builds fail, I'll investigate.

quantified-uncertainty / squiggle

Value serialization #3158
