HaxeFoundation / haxe

Haxe - The Cross-Platform Toolkit
https://haxe.org
6.14k stars 656 forks source link

[eval] https tests #11638

Open kLabz opened 5 months ago

kLabz commented 5 months ago

There's something wrong with mbedtls (or more likely how we use it), but I couldn't fix this yet.. :/

Also, hl on mac seems to have a similar issue :thinking:

kLabz commented 5 months ago

@Apprentice-Alchemist do you have an idea what we're doing wrong there that could cause this Unix.Unix_error(Unix.EBADF, "recv", "") error?

https://github.com/HaxeFoundation/haxe/blob/db842bfd6c815784f16da415d6718bcb7d393a66/libs/mbedtls/mbedtls_stubs.c#L447-L450

https://github.com/HaxeFoundation/haxe/blob/db842bfd6c815784f16da415d6718bcb7d393a66/std/eval/_std/sys/ssl/Socket.hx#L27-L37

https://github.com/HaxeFoundation/haxe/blob/db842bfd6c815784f16da415d6718bcb7d393a66/std/sys/Http.hx#L444-L453

Edit: Seems like I can dodge the issue with some GC runs

Apprentice-Alchemist commented 5 months ago

Hmm. If I try haxe.Http.requestUrl("https://raw.githubusercontent.com/HaxeFoundation/haxe/development/tests/unit/res1.txt") it works, if I try with https://build.haxe.org/builds/haxe/linux64/haxe_latest.tar.gz I get a segfault.

As for that EBADF issue, the cause would be this function being called with a bad socket https://github.com/HaxeFoundation/haxe/blob/db842bfd6c815784f16da415d6718bcb7d393a66/src/macro/eval/evalSsl.ml#L72-L74 aka the issue would be somehwere here https://github.com/HaxeFoundation/haxe/blob/db842bfd6c815784f16da415d6718bcb7d393a66/libs/mbedtls/mbedtls_stubs.c#L462-L481 I don't know enough about Ocaml FFI to immediately see anything wrong with the bindings.

kLabz commented 5 months ago

Yes the file needs to be big enough to trigger that issue. The GC workaround makes me think we're allocating a bit too much there, and eval GC doesn't like while loops ? 🤔

Simn commented 5 months ago

I'm not very familiar with OCaml FFI either, but I always thought that only CAMLprim value kind of functions were supposed to have all this CAMLparam0 and such cruft. If that isn't the case then at the very least it looks suspicious that there's a param0 thing on a function that clearly has 3 parameters.

Apprentice-Alchemist commented 5 months ago

That param0 thing is needed for that function to be able to declare value locals, but since it doesn't take any value parameters it doesn't need to use CAMLparam3. https://ocaml.org/manual/5.1/intfc.html#sec503

Simn commented 5 months ago

Hmm, but it does have this vctx = (value)ctx; line which makes me wonder if ctx shouldn't be a value in the first place.

Apprentice-Alchemist commented 5 months ago

Thinking about it some more, the real problem is probably that ctx is stored somewhere the GC can't see.

Simn commented 5 months ago

Yes I think that's what I'm trying to get at as well.

Simn commented 5 months ago

Update please!

kLabz commented 5 months ago

Well.. locally it's worse than before (the workaround isn't working anymore), even if CI seems fine so far :thinking: There's something wrong with the mbedtls 3 impl I think, will investigate tomorrow.

Apprentice-Alchemist commented 5 months ago

Works fine on my machine, with or without the workaround. (Arch Linux, compiler linked with mbedtls 3)

kLabz commented 5 months ago

Works fine on my machine, with or without the workaround. (Arch Linux, compiler linked with mbedtls 3)

You do enable https tests by commenting out this line, right? https://github.com/HaxeFoundation/haxe/blob/6bda9cca3e56c5172e016511c8804f36730cc677/tests/unit/src/unit/TestHttps.hx#L7

I'm on Arch Linux too and I get this with or without the workaround:

src/unit/TestMainNow.hx:6: Generated at: 2024-04-30 11:49:31
src/unit/TestMain.hx:31: START
malloc(): invalid size (unsorted)
[3]    2210161 IOT instruction (core dumped)  haxe compile-macro.hxml

(unless I use nightlies which are not compiled with mbedtls 3, then it's fine)

Apprentice-Alchemist commented 5 months ago

Yes, I enabled https tests.

Does the core dump give a useful backtrace?

kLabz commented 5 months ago

Doesn't give anything :/ Can't repro my issue on CI, but also can't make it work locally unless I force mbedtls 2..

Also, a similar fix might be needed for mac+hl for these tests to pass

Apprentice-Alchemist commented 5 months ago

Regarding the HL+macOS failure, I don't think it's caused by GC issues or stuff like that.

I did a little bit of debugging using CI (https://github.com/Apprentice-Alchemist/hashlink/actions/runs/8907771743/job/24462210850) The error is caused by mbedtls failing in TLS 1.3 stuff: https://github.com/Mbed-TLS/mbedtls/blob/2ca6c285a0dd3f33982dd57299012dacab1ff206/library/ssl_tls13_generic.c#L1684-L1693

Not yet sure what the root cause is, but I suspect it might be due to some compile option getting enable by Homebrew somehow?

Apprentice-Alchemist commented 5 months ago

Ah, figured it out.

Starting in MbedTLS 3.6 MBEDTLS_SSL_PROTO_TLS1_3 is enabled by default. The TLS 1.3 code uses PSA Crypto, which requires psa_crypto_init to be called before using any mbedtls functions.

kLabz commented 5 months ago

Ah nice, thanks! Would be nice to have a draft PR with your branch so that it could be done at some point (by you or someone else), and I could disable the test here in the meantime.

As for my local issue, as long as it's only happening for me I guess it's ok... :/

Simn commented 2 months ago

I have merged #11655, please update the branch!

kLabz commented 2 months ago

I'm still getting malloc(): invalid size (unsorted) locally, which