janet-lang / janet

A dynamic language and bytecode vm
https://janet-lang.org
MIT License
3.45k stars 223 forks source link

conditional heap buffer overflow via janetbyteview #1204

Closed CosmicToast closed 1 year ago

CosmicToast commented 1 year ago

Since Janet buffers are not null-terminated, any char*s out of a byteview (including from getcbytes) cannot be passed to native functions (as they will typically strlen or strdup those). Additionally, getcbytes (and previously, getcstring) itself runs strlen (since C99 does not define strnlen or strlen_s). The most obvious way out of this is by modifying JanetByteView to add a \0 to the end of the buffer (expanding it if needed), since that will often signal a conversion to cstring. This is primarily relevant to the C API, since most external libraries (I discovered this in Jurl when passing larger than initial buffer allocation JSON) take a plain char*. It's not really possible to move the content to a string since that is likely to get garbage collected before use. It's similarly not really possible to simply copy everything over, since that will then leak the memory. I'd appreciate alternative suggestions as well. In the meanwhile, I'm going to freeze all jurl arguments once I get around to patching it.

bakpakin commented 1 year ago

It's not really possible to move the content to a string since that is likely to get garbage collected before use.

The garbage collector only runs when the interpreter is running, so inside C Functions and other C code, the rug will not be pulled out from under you. Also calling into the interpreter from inside a C Function, callback, or similar will suspend the garbage collector for this reason.

janet_getcbytes was indeed buggy here, and this is why the original janet_getcstring did not allow passing in buffers. I've pushed a fix in f6248369fe4e1e5c92f0d20348256e4bab51c5f6 with the suggested solution as well as handling for the rare case where a buffer is marked as non-resizable.

CosmicToast commented 1 year ago

Technically, curl code is re-entrant - you setopt, and then you perform later. Jurl more or less relies on the incoming data types being captured so long as the handle is valid (either in a closure, or in request). So garbage collection can be a concern in such use-cases (basically, I'd expect getcstring/getcbytes output to be valid for as long as the input Janet value is).