jow- / ucode

JavaScript-like language with optional templating
ISC License
87 stars 24 forks source link

Draft: Add small ctypes library #121

Open TjeuKayim opened 1 year ago

TjeuKayim commented 1 year ago

Adding a library inspired by Python's ctypes. The OpenWRT package libffi is 11kB.

This implementation is still a draft.

Is a libffi wrapper something you want to include in this repository?

jow- commented 1 year ago

Great work! Please give me a few days to review this.

I don't see any reason not to add this module to the repository.

pktpls commented 1 year ago

I'd be interested to see this happen in order to interface with socket functions and cryptography libraries. I rebased this PR on ucode master and the example works (only had to work around a maybe-uninitialized false positive).

I'll try to get a little stupid DNS server working.

pktpls commented 1 year ago

I've added a way to expose errno via the vm struct, and have also put that maybe-uninitialized workaround on my branch: https://github.com/pktpls/ucode/tree/libffi

Found one issue where I'm not sure if the cause is in ucode, this PR, or between keyboard and chair:

There's some kind of difference between c.ptr(buf).as_int() and let p = c.ptr(buf); p.as_int(), only the latter works correctly.

let buf = struct.pack("16x");

// correct: bind(4, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16) = 0
let p = c.ptr(buf);
libc.bind(sockfd, p.as_int(), 16);

// bad: bind(4, {sa_family=0x8b /* AF_??? */, sa_data="\0\0\0\0\0\0\360W)\2\0\0\0\0"}, 16) = -1 EAFNOSUPPORT (Address family not supported by protocol)
libc.bind(sockfd, c.ptr(buf).as_int(), 16);

Note that I can't comment on the code quality and integration decisions of this PR, I'm just a bystander testing things out. Stuff seems to work though, so it's much appreciated!

TjeuKayim commented 1 year ago

There's some kind of difference between c.ptr(buf).as_int() and let p = c.ptr(buf); p.as_int(), only the latter works correctly.

Unfortunately, one can indeed easily introduce use-after-free bugs in this design. When declaring a variable, garbage collection happens when it goes out of scope. Without binding it to a variable, ptr_gc() already deallocates the memory that p.as_int() points to before bind() executes. See this code example:

const c = require("ctypes");
const foo_bind = addr => print("debug ", addr, ", ", hexenc(c.ptr(addr).ucv_string_new(8)), "\n");
const buf = "12345678";

let p = c.ptr(buf);
foo_bind(p.as_int()); // runs ok

foo_bind(c.ptr(buf).as_int()); /** causes AddressSanitizer: heap-use-after-free
READ of size 8 at 0x602000001e90 thread T0
    #0 0x7fa5c6449e0a in __interceptor_memcpy (/lib64/libasan.so.8+0x49e0a)
    #1 0x7fa5c6b592d5 in ucv_string_new_length /src/ucode/types.c:409
    #2 0x7fa5c5e65726 in ptr_copy_uc_string /src/ucode/lib/ctypes.c:228
(...)
freed by thread T0 here:
    #0 0x7fa5c64b9388 in __interceptor_free.part.0 (/lib64/libasan.so.8+0xb9388)
    #1 0x7fa5c5e654b7 in ptr_gc /src/ucode/lib/ctypes.c:184
    #2 0x7fa5c6b587c5 in ucv_free /src/ucode/types.c:283
(...)
previously allocated by thread T0 here:
    #0 0x7fa5c64ba097 in calloc (/lib64/libasan.so.8+0xba097)
    #1 0x7fa5c5e643eb in xcalloc /src/ucode/include/ucode/util.h:85
    #2 0x7fa5c5e6446d in xalloc /src/ucode/include/ucode/util.h:96
    #3 0x7fa5c5e64c69 in ctypes_new_ptr /src/ucode/lib/ctypes.c:90 */
pktpls commented 1 year ago

That makes a lot of sense, thank you! I had also already noticed that variables are pass-by-value, so buffers need to be kept close to the CFI invocation, and can't easily be initialized somewhere else and passed around. Which is fine now that I know about it :-)

Anyway, I had a pretty good experience with the ctypes module so far. :ship:

pktpls commented 1 year ago

Have you thought about support for callback arguments yet? I might give it a shot at a later point.

jow- commented 1 year ago

Conceptually it would be possible to let the ctypes module maintain a registry of pointer instances it produced, the registry would retain a reference to each pointer object and inhibit garbage collection. This would need to coupled with something like a new ptr.release() method to explicitly free the resource.

What do you think? Apart from that I believe the usage of ctypes can never be "safe" as you could still use c.ptr(...) to address arbitrary memory locations which you then subsequently dereference... so not sure if it would make sense to sacrifice code conciseness (add explicit memory management to pointers) to only solve halve of the issue anyway.

pktpls commented 1 year ago

Thanks for the "pointers" - yeah, my hunch was also that it'd be a little more involved. I feel like it's out of scope for this PR, it seems to work well enough as is.

I have at least one library in mind which requires callbacks (jech/dht), but we'll see if I even have a use for it :)

jow- commented 1 year ago

@TjeuKayim - I would like to do a new release soon and I consider including this library. Is there anything remaining to do? If not, could you squash the commits and mark this PR as finished so that I can merge it?

TjeuKayim commented 1 year ago

Is there anything remaining to do?

Support for callback arguments by wrapping the libffi Closure API can better wait until a separate merge request in the future, as I only made a small start on that feature.

Apart from that, this ctypes library lacks user documentation, and I'm not that confident about the code quality (especially the unsafe pointer casts with uc_cif_t.values and the too long functions).

This merge request is ready for a code review.

pktpls commented 1 year ago

Just in case you release immediately after merging and there's no window for a PR: please also consider adding errno :) I updated my commit to use uc_vm_registry_* which seemed like the appropriate place. You can pick these two commits:

If the pseudonymous sign-off is a problem, I'd be happy if someone else takes the diffs and runs with them :)

I'll try to get a little stupid DNS server working.

For the record here's some rough ugly code that creates and reads from a UDP socket: https://gist.github.com/pktpls/07e4313f2353a9341da4e4b82b2c8911 - I didn't get to the actual DNS serving yet.

yichya commented 1 year ago

Tried this with a really simple code to add a module to rpcd for invoking HTTP requests as a proof of concept: https://github.com/yichya/rpcd-mod-http

Looking forward to getting this feature merged soon