HaxeFoundation / haxe

Haxe - The Cross-Platform Toolkit
https://haxe.org
6.03k stars 648 forks source link

[hl] use hl.NativeArray for Vector #11568

Open Simn opened 5 months ago

Simn commented 5 months ago

I really think that Vector should be implemented on top of fixed array types on all targets where this is possible. I'm attempting this here for HL, which seems to work well enough on HL/JIT, but fails on hlc:

Command: haxe [compile-hlc.hxml,-D,Windows]
Command exited with 1 in 6s: haxe [compile-hlc.hxml,-D,Windows]
test hl failed
 ERROR  (unknown position)

   | Error: Don't know how to compare array and array (hlc)

As @yuxiaomao points out, this has been reported before in #11468.

@ncannasse Any advice?

Simn commented 5 months ago

Another thing to address is that we're losing the sort function at the moment because the previous implementation just forwarded this to Array.sort. This means that we should finally look into #3388 first.

Simn commented 4 months ago

Thanks to @Apprentice-Alchemist for telling me how to fix the array compare problem! HL is now green, and I'd like to merge this soon so that we can use Vector in the public std API across all targets.

I don't know what to do with the sort problem though. It seems like we would need implementations of https://github.com/HaxeFoundation/haxe/blob/development/std/haxe/ds/ArraySort.hx for all Vector "variants" on HL, but even then it's not obvious how to define the actual sort function itself.

RblSb commented 4 months ago

I don’t know what exactly you’re talking about, but maybe it’s worth making the function inlineable so that inline vector.sort(...) can inline the callback (not sure how faster it can be on different targets)

Simn commented 4 months ago

Huh, I thought this didn't work on HL, but it seems like I'm wrong about that:

import haxe.ds.Vector;

function f<T>(x:Vector<T>) {
    trace(x);
}

function main() {
    var v = new Vector<Int>(1);
    f(v);
}

This makes me reluctant to merge this because apparently I don't understand how hl.NativeArray works.

Apprentice-Alchemist commented 4 months ago

It works, as long as you don't try to access the contents of the native array. The type being read is determined at the site of array access, so doing x[0] inside f would try to read a Dynamic instead of an Int, which can cause segfaults (https://try.haxe.org/#BE053756).

Simn commented 4 months ago

Right, so the situation is actually even worse than I thought. Surely this should be caught earlier...

Though to be fair, I also don't catch this on the JVM target and instead let it fail in the verifier with Exception in thread "main" java.lang.ClassCastException: class [I cannot be cast to class [Ljava.lang.Object;.

Simn commented 4 months ago

Your example gives me a proper error locally though:

Uncaught exception: Access violation
Called from _Main.$Main_Fields_.main(Main.hx:4)
Called from .init(?:1)

Do you also get that segfault locally with recent HL?

Apprentice-Alchemist commented 4 months ago

Access violation is the Windows equivalent of a segfault, looks like HL turns those into real exceptions somehow.

Apprentice-Alchemist commented 4 months ago

Testing locally on my Linux machine gives

SIGNAL 11
_Test2.$Test2_Fields_.main(Test2.hx:4)
.init(?:1)
fish: Job 1, 'hl out.hl' terminated by signal SIGSEGV (Address boundary error)

HL catches the segfault too but doesn't turn it into a real exception and just prints a stack trace before letting the default signal handler run.

Simn commented 4 months ago

Yeah okay, I can't merge this before this is improved somehow because we would be going from working code to a segfault, which isn't the best upgrading experience...

yuxiaomao commented 2 months ago

I don't think re-implement Vector directly with hl.NativeArray is a good idea:

https://github.com/HaxeFoundation/hashlink/blob/f463d71217c08f91c8b2d90f75c4d02afc9ff4f5/src/std/types.c#L43-L67

Simn commented 2 months ago

I'll tell you how I got here:

I'm aware of the problems you point out, but note that Vector is supposed to be completely transparent, and its type parameters are supposed to be invariant. The problem is that this cannot be expressed in our type system at the moment, so the only way to catch this is at the generator-level, where Vector already has special treatment on various targets anyway.

These variance violations have been failing natively since the old Flash times, and I'm fine with that, but it shouldn't fail with an incomprehensible segfault.

Apprentice-Alchemist commented 2 months ago

I think catching this in the generator would require inserting extra type checking during the typed expr -> bytecode phase in every place where there is a potential type "conversion".

This can't be done in some kind of bytecode verification phase either because the native array type does not carry type information at compile time.

Simn commented 2 months ago

This can't be done in some kind of bytecode verification phase either because the native array type does not carry type information at compile time.

Meh, I expected arrays to know their element type. In that case I don't know how to properly manage this either.

Apprentice-Alchemist commented 2 months ago

What about checking on array access/allocation instead, at that point the generator should be able to determine whether the element type is a type parameter. https://github.com/HaxeFoundation/haxe/blob/72175301c3dbd3eaa70f3479e6e46cb620242eef/src/generators/genhl.ml#L2001-L2031

The example with trace(x) would still work, but actually accessing the array would give a compile time error instead of segfaulting.

Hmm. That wouldn't fix the var a:hl.NativeArray<Dynamic> = new hl.NativeArray<Int>(1); case though.

ncannasse commented 2 months ago

The HL code to manage different types of Arrays is quite tricky.

So we shouldn't duplicate all of these for Vector, but it being an abstract with inline functions should work well ?

Simn commented 2 months ago

I agree it should work for normal use-cases. The problem is that currently assigning Vector<Int> to Vector<Dynamic> is just fine because HL's Vector is based on Array, and our type system doesn't care about this case. After this change it would instead segfault unceremoniously, which would be a terrible upgrading experience. That's why I'm looking into ways to make this fail nicer.