DanielXMoore / Civet

A TypeScript superset that favors more types and less typing
https://civet.dev
MIT License
1.33k stars 28 forks source link

First version of comptime (synchronous, no outer scope) #1180

Closed edemaine closed 2 months ago

edemaine commented 2 months ago

This is a first version of comptime blocks based on Rust's comptime crate. Limitations:

The elephant in the room is security. eval always feels dangerous. On the other hand, I can't immediately think of a threat model where you're running an NPM script that runs civet for compilation and it couldn't run any other command. Still, if feels a little weird that compiling with the civet CLI could run arbitrary code. But this is also inherent to macros; see e.g. how Rust's comptime crate is implemented: it runs an external copy of Rust.

Another example is editing in VSCode. As you're typing code, it will immediately run the comptime blocks on every keystroke which could take down the language server (infinite loop) or other "fun" accidents.

I feel like, at the least, we need a flag that controls comptime, and maybe the default should be off. So maybe civet --comptime to enable, or in the unplugin there's a {comptime: true} option. I haven't added this yet because I wanted to get input first on what you think is best, but it's my current thoughts.

Alternatively, we could add a comptime phase separate in between parse/prune and generate. This would have the potential advantage of keeping those phases synchronous, so only the comptime phase needs to be asynchronous. But it would make it harder to use from an API perspective. Or maybe we offer multiple top-level APIs, synchronous which forbid comptime, and asynchronous which allows it?

Also:

bbrk24 commented 2 months ago

Output must JSON.stringifyable

This is a more serious limitation than just reference cycles and custom classes. JSON.stringify silently translates [1, undefined, 3] into [1, null, 3].

edemaine commented 2 months ago

Ah, I didn't know that one. A recent robust and fairly small serializer library I know of is seroval (created for use in Solid SSR I believe). Are we OK adding a dependency for this? Alternatively, we could probably make our own pretty easily with pattern matching, especially if we don't care about cycles and custom classes. Hmm, maybe our own is cool because we could plausibly fix custom classes one day, as we have influence over the compile and run time...

bbrk24 commented 2 months ago

Would this work as a starting point for rolling our own serializer?

STRd6 commented 2 months ago

This looks like a good start. I think it will take a bit of experimentation to figure out the best way to handle comptime but we should be able to get there incrementally.

Having comptime off by default with an option or flag to enable seems best to me, especially considering the potential LSP hanging issues. Eventually we may switch to a separate thread that can be cached/killed. This also mitigates some risk from civet ... shell commands having more capabilities than someone might expect. Being explicit lets people opt in and choose their trust level based on the code they are working with.

I looked briefly at https://github.com/lxsmnsyc/seroval but it still seems a bit large to include for this one feature, at least for now. Having our own simple one or even just documenting and living with the JSON limitations while we experiment is fine.

STRd6 commented 2 months ago

One other security note: if we enable comptime on the web playground we need to be sure to use a sandbox otherwise the sharing links can be used for XSS.

bbrk24 commented 2 months ago

otherwise the sharing links can be used for XSS.

Is that not already possible? Granted I guess it's currently gated behind the "Run" button.

edemaine commented 2 months ago

Yeah, requiring interaction vs. just following a link is a significant difference. On the Playground, maybe we add an additional checkbox to enable comptime which is only visible when the word comptime appears in the source code. I think we can continue to enable it in the Reference (maybe just on the comptime examples), but anyone editing those examples live would need to be a bit careful to not write an infinite loop. I'll look at this today.

bbrk24 commented 2 months ago

Would this work as a starting point for rolling our own serializer?

This sort of demonstrates its own flaw with allowing the serialization of functions: the captured hasOwn and circularRefDetection are not declared/initialized. As convenient as it would be to be able to have comptime functions in simple cases like (1+), I'm not fully convinced it's worth the potential errors from invalid captures.

edemaine commented 2 months ago

I think it's still potentially useful to serialize functions (though seroval doesn't even try). We can state the limitation in the documentation.

I've pushed a version (including @bbrk24 as coauthor) that supports Maps and Sets too [update: and Date and RegExp], which seems like it would be helpful. Security next!

edemaine commented 2 months ago

Security updates:

Question: Should comptime be settable via config files? Currently it's allowed. This seems like it might be helpful, and is maybe less surprising, but I'm not certain.

I also did some major restructuring that we've always (?) wanted to do for parser.hera, which moves a lot of code out of there into new files. @STRd6 Do you want to review these? It wouldn't hurt to have a second set of eyes.

Modulo potentially disabling comptime in config files, I think this is ready for a first merge, though I'll want to fix require soon if possible.

edemaine commented 2 months ago

Question: Should comptime be settable via config files? Currently it's allowed. This seems like it might be helpful, and is maybe less surprising, but I'm not certain.

What do you think about this one? The safer side would be to disable it, I guess, so it needs to be specified in the build step (CLI or unplugin or esbuild plugin etc.). Thinking about it more, this feels more robust, because we look for config files all the way up the parent, so I vaguely worry about an attack (or accident) where someone places a config file up the tree and then gets arbitrary code execution while compiling. More consistent to disable it I guess.

Ah, further justification: If it's enabled in the config file, the VSCode extension will live-run comptime blocks, which we don't want.