google / go-jsonnet

Apache License 2.0
1.63k stars 235 forks source link

Concurrency safe `FileImporter` #733

Open brancz opened 1 year ago

brancz commented 1 year ago

We embed the jsonnet VM in our CI/CD infrastructure and instantiate a VM per generation cycle, but found that when using the same VM instance for all generations it's quite a bit faster (I expect because of caching as many of the generations use the same files). However, at the same time we found the FileImporter to be not concurrency safe. Would you be open to adding locking to the file importer?

As I was writing this it also occurred to me, is the jsonnet VM otherwise concurrency-safe? This particular issue we found quickly during development, but I'm curious if it would be more work than just making FileImporter concurrency safe.

sparkprime commented 1 year ago

@sbarzowski would know better but I think we didn't make it thread safe at all. It might be possible to put locks throughout the VM. I think our reasoning was that the language itself was single threaded so there was no point allowing a VM to process two executions at once. What if you wrapped the whole thing in a single global lock, or just serialised your interactions with it?

brancz commented 1 year ago

That would mean we undo any gain we get from concurrently generating, which is very significant.

sparkprime commented 1 year ago

If the gains you got were from cacheing then you'd still get those gains on account of the cache being maintained between runs

brancz commented 1 year ago

That's separate, we both run generation per "stack" concurrently, and when we did that we saw an additional improvement when using the same runtime. We could write our own importer, but if the rest of the runtime is not concurrency safe then it's of little use.

sparkprime commented 1 year ago

I see.

Any chance you can look at go race detector to see if it's only a small number of places that need protecting?

Another option might be to create n VMs but have them all accessing the same import cache, which is a smaller unit of code to protect.

brancz commented 1 year ago

I can't believe I haven't thought of that, agreed, go race detector should be able to catch this with sufficient amounts of iterations. Will try!