Open Simn opened 8 years ago
I have studied quite in detail the problem of variance with Arrays, and the particular case of Array<Dynamic>
, as part of my work on HL.
I came up with the following design that I would like to be generalized on all "static" targets that have type-specific arrays (usually for basic types such as Array<Float>
), so that would concern C++,C#,Java, and HL.
I define several Array classes implementations:
ArrayBasic<Float>
, ArrayBasic<Int>
which uses @:generic
, then ArrayObj which contains any kind of "pointer" object. They implement both the "classic" Haxe Array API and some extra *Dyn()
functions that comes from ArrayBaseArray<Dynamic>
implementation, which operates on an underlying ArrayBase.The trick is the following:
When creating a new ArrayDyn, we have an extra flag in it which is named allowReinterpret
, which means that when we cast such ArrayDyn to an Array<Int>
for instance, we will first reinterpret the underlying ArrayObj to an ArrayBasic<Int>
before returning it, then set allowReinterpret to false, which means that after the first cast to a known type, the Array type becomes fixed, which prevent copying the whole array on each of such casts.
I think this cover all the problematic cases (JSON, Serialization, etc.)
This requires two kind of runtime support/emulation:
__cast
method__get_field
in HL. That's the reason length is not in ArrayAccess.The full source code for HL solution is available here: https://github.com/HaxeFoundation/haxe/tree/hl/std/hl/types (ArrayBase, ArrayObj, ArrayDyn)
I would like to hear about @hughsando and @waneck comment on this, then if it's fine with both of you, I can write unit tests for it so we make sure that this behavior is correctly standardized on all static platforms.
PS: one last thing I haven't yet worked on is referential equality : we also want our ArrayDyn == its underlying ArrayBase.
I have implemented a similar thing with "VirtualArray" The array starts off unknown, and then as you add stuff to it, the underlying representation changes until you ultimately cast it, and then it becomes fixed. So if you add a bunch of ints, it is an int array. If you then add a float, it becomes a float array. If you add a null, it becomes a Dynamic array. This should help by avoiding boxing in a lot of cases. You can try this in hxcpp target by swapping the commenting of these likes: "cpp::ArrayBase" ^ suffix (* "cpp::VirtualArray" ^ suffix *) (side note - can have a global variable to switch this in ocaml, or do I need to pass a context everywhere?)
It seems to work ok, but I have not got it working in cppia.
On Tue, Feb 23, 2016 at 7:59 PM, Nicolas Cannasse notifications@github.com wrote:
PS: one last thing I haven't yet worked on is referential equality : we also want our ArrayDyn == its underlying Array.
— Reply to this email directly or view it on GitHub https://github.com/HaxeFoundation/haxe/issues/4872#issuecomment-187671417 .
@hughsando the problem with this approach is that your array keep changing its underlying type, for instance if you add two floats, then cast to Array<Int>
, then add floats again, what is the behavior ? My proposal is that the array is either in an "unkown state" (dynamic) or in a known state, and once it is it cannot change anymore. And since the first cast to an not-Array-Dynamic will make it a known state, it guarantee that you never happen to have several copies of the underlying array which does not preserve side effects.
The casting only ever happens in one direction: empty -> Int -> Float ->
Dynamic (maybe skip Float or Int), or empty -> String -> Dynamic or empty
-> Bool -> Dynamic. or just empty -> Dynamic.
As soon as you cast, it locks it in forever (like your solution). The good
thing about this is that the final cast (eg, to Array
On Wed, Feb 24, 2016 at 4:01 AM, Nicolas Cannasse notifications@github.com wrote:
@hughsando https://github.com/hughsando the problem with this approach is that your array keep changing its underlying type, for instance if you add two floats, then cast to Array
, then add floats again, what is the behavior ? My proposal is that the array is either in an "unkown state" (dynamic) or in a known state, and once it is it cannot change anymore. And since the first cast to an not-Array-Dynamic will make it a known state, it guarantee that you never happen to have several copies of the underlying array which does not preserve side effects. — Reply to this email directly or view it on GitHub https://github.com/HaxeFoundation/haxe/issues/4872#issuecomment-187878494 .
Ok, that seems good to me, I could indeed optimize HL this way as well, it's not something really detectable by the enduser anyway.
Any reason it has not been actived by default in hxcpp yet ?
Also, still waiting to hear from @waneck :)
Testing and cppia mainly - I wrote it, but did not have time to put out fires if it went wrong. Now I've moved over to "haxe mode", I will try to git it in in the next week or two.
Hugh
On Wed, Feb 24, 2016 at 8:49 PM, Nicolas Cannasse notifications@github.com wrote:
Ok, that seems good to me, I could indeed optimize HL this way as well, it's not something really detectable by the enduser anyway.
Any reason it has not been actived by default in hxcpp yet ?
Also, still waiting to hear from @waneck https://github.com/waneck :)
— Reply to this email directly or view it on GitHub https://github.com/HaxeFoundation/haxe/issues/4872#issuecomment-188240459 .
I just wanted to start adding tests for this but didn't get very far:
class Main {
static function main() {
var a:Array<Dynamic> = [1, 2, 3];
trace("I'm just here to prevent the analyzer fusion");
var b:Array<Int> = cast a;
trace("Yeah me too");
trace(a == b); // false
}
}
@hughsando: Wasn't this supposed to work now?
Looks like it is an issue with "op==". If you do: b.push(4); trace(a); you get 1,2,3,4
I added the operators to hxcpp.
That helped with the direct ==, but it still fails if we hide the arrays behind type parameters:
class Main {
static function main() {
var a:Array<Dynamic> = [1, 2, 3];
trace("I'm just here to prevent the analyzer fusion");
var b:Array<Int> = cast a;
trace("Yeah me too");
eq(a, b);
}
static function eq<T>(e1:T, e2:T) trace(e1 == e2); // false
}
P.S @ncannasse: That fails on HL too, even in the interpreted version.
There's not much of a point in me adding a test that fails on multiple targets. I'll assign this issue to the 3.3.0 final milestone so we can see what we want to do here after the RC.
@Simn which targets apart hxcpp are concerned ? Java and C# I presume ?
@hughsando could you look into it for RC? I think this kind of non-equality could break people code when trying RC.
I think hxcpp is fine now actually, so it's probably Java, C# and HL.
Please submit the unit tests with #if !(hl || java || cs) for the parts that fail for now, and open a separate issue for these.
I don't like these #if tests, it's too easy to forget about them. I've added it without the conditional compilation to a separate branch so we can forget about that instead.
As expected the test fails on C# and HL: https://travis-ci.org/Simn/haxe/builds/120320083
I'm assigning to @waneck because HL is not ready for 3.3 anyway
I won't be able to get this done for 3.3. Pushing to 3.4
Moved from https://github.com/HaxeFoundation/haxe/issues/5467 .
It looks like C# implementation copying all fields of original Array. (Using reflection for some reason). Doesn't that cause more issues? That's slow, and newly created array no longer refers to the same array. I don't want Haxe to copy Array in that case.
Yes, this is #2103.
Is reflection always needed to achieve this? I don't know where reflection calls came from. Are they automatically generated by Haxe?
I think hxcpp is fine now actually, so it's probably Java, C# and HL.
@Simn Both of the test cases you posted here still trace false
for me on hxcpp (Haxe 3.4.0 and hxcpp 3.4.49). Or was it fixed at some point and then broken again?
I have checked hxcpp again and both tests work there now. I also made sure that the analyzer doesn't inadvertently fix it.
I didn't test HL because all I ever get out of that these days is "Unsupported bytecode version". It's definitely still broken on Java and C# though.
I confirmed HL to still be failing here as well.
I have fixed HL support for this specific test in be1a102a910ec8446dce8c325dc23aeb9a768033
This might break some runtime wrt Array
@RealyUniqueName Could you look at the PHP failure?
class Main {
@:analyzer(ignore)
static public function main() {
var a:Array<Dynamic> = [1, 2, 3];
var b:Array<Int> = cast a;
b.push(4);
trace(a[3]); // null
}
}
Note that the @:analyzer(ignore)
is required to reproduce this.
Java shouldn't have any issue like that.
As for C#, my real issue with implementing this is how should we deal with user-created type parameters? The possible solutions I can think are:
erase_generics
the default)I can't think of a way to generalize the Array handling, but I'd be interested to hear ideas. As for between 1 and 2, I'm also not sure what's best. Does anyone have strong opinions regarding them?
What problems does erase_generics
cause?
I think it's mostly about performance and generated code compatibility with native C# code
Arrays are special - in particular Array
On Wed, Jun 6, 2018 at 1:36 AM, Cauê Waneck notifications@github.com wrote:
Java shouldn't have any issue like that.
As for C#, my real issue with implementing this is how should we deal with user-created type parameters? The possible solutions I can think are:
- We don't generate type parameters for hxGen C# types (e.g. make erase_generics the default)
- We keep our old behavior, so that Array is a special case
- We try to generalize the way we handle Array so it works for any generic class
I can't think of a way to generalize the Array handling, but I'd be interested to hear ideas. As for between 1 and 2, I'm also not sure what's best. Does anyone have strong opinions regarding them?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/HaxeFoundation/haxe/issues/4872#issuecomment-394796613, or mute the thread https://github.com/notifications/unsubscribe-auth/ABlp1lM3ircLKQDruzIvat-8RN-xTQjzks5t5sG0gaJpZM4Hgg8I .
@RealyUniqueName Could you look at the PHP failure?
That's because Array.push()
is inlined in std of php backend.
That push is generated as:
$b = $a;
$this1 = $b->arr;
$idx = $b->length;
$this1[$idx] = 4;
++$b->length;
And $b->arr
is a native PHP array. And native arrays are passed by copy, not by reference. I think I can fix this in std without removing inline
.
But this is not ok to automatically introduce temp vars for accessing object fields. I guess it also affects native structs on C# target.
I've updated the branch again:
src/unit/TestArray.hx:12: 4 should be null
src/unit/TestArray.hx:9: characters 5-11 : Don't know how to compare #hl.types.ArrayDyn and #hl.types.ArrayBytes_Int
src/unit/TestArray.hx:9: false should be true src/unit/TestArray.hx:10: [1,2,3] should be [1,2,3,4] src/unit/TestArray.hx:11: 4 should be 3 src/unit/TestArray.hx:12: 4 should be 0
Can you please point me to the branch? Can't find it )
Sure, I keep these branches on my own GitHub because that tends to be better for CI: https://github.com/Simn/haxe/tree/array_variance_tests
PHP fails because of the same problem described in my previous comment: https://github.com/HaxeFoundation/haxe/issues/4872#issuecomment-394954238
Yes but is that for me or for you to fix? If it's a problem with temp vars we have to check if it can be reproduced manually (without inline).
I think it's for you to fix. Because those temp vars will case the same problem for any field of non-scalar value-type.
Then we have to mark the affected types with some metadata. Though I don't really like having to support copy-semantics just for this.
Note that this example also doesn't behave like the syntax would suggest:
class Main {
@:analyzer(ignore)
static public function main() {
var a = [1, 2, 3];
var b = @:privateAccess a.arr;
b[3] = 4;
trace(a[3]); // null
}
}
Note that this example also doesn't behave like the syntax would suggest
This is a defined behavior for php.NativeArray
so it's up to the user to decide what to do with it.
On the other hand inst.inlinedMethod()
temp vars generation is out of the user's control, so it should behave the same way with and without inlining.
This would mean that for certain types we never create a temp var. This is quite problematic for various cases I can think of.
IMO this has to be approached differently, e.g. by not using a field access expression on the native array type.
Moved to 4.1 because it's a rather involved project.
Any motion on this for the exciting 4.1 release? I'm hoping https://github.com/HaxeFoundation/haxe/issues/8314 gets cleared.
This won't be resolved for 4.1. Which target are you concerned about?
C# We started using haxe 4 in preview 3. We liked the syntax so we started incorporating it as we weren't going to release for a year. We figured nothing major would change. Now our Server side (C#) is compiling out of rc2 and our client side (JS) is compiling out of 4.0.5
@kLabz What do you think about this? Any workaround you can recommend for #8314? I really don't want people to be stuck on RC versions.
I don't think there's much that can be done except fix this issue.
The same kind of functionalities can be achieved, using c# utils (so not in a cross-target way), and without Array<T>
(so it requires some rewrite):
import cs.system.collections.generic.List_1;
import newtonsoft.json.JsonConvert;
@:publicFields
class DataItem {
var data:String;
var strArray:List_1<String>;
var intArray:List_1<Int>;
var dynArray:List_1<Dynamic>;
var anyArray:List_1<Any>;
}
class Main {
public static function main():Void {
var t:DataItem = JsonConvert.DeserializeObject(
haxe.Json.stringify({data:"d", strArray:[], intArray: [], dynArray: [], anyArray: []})
);
var strArray = t.strArray;
strArray.Add("via reference");
t.strArray.Add("via getField");
trace(strArray.Count); // 2
trace(t.strArray.Count); // 2
var strArray = t.strArray;
strArray.Add("via reference");
t.strArray.Add("via getField");
trace(strArray.Count); // 4
trace(t.strArray.Count); // 4
var intArray = t.intArray;
intArray.Add(0);
t.intArray.Add(1);
trace(intArray.Count); // 2
trace(t.intArray.Count); // 2
var dynArray = t.dynArray;
dynArray.Add(0);
dynArray.Add("test");
trace(dynArray.Count); // 2
trace(t.dynArray.Count); // 2
var anyArray = t.anyArray;
anyArray.Add(0);
anyArray.Add("test");
trace(anyArray.Count); // 2
trace(t.anyArray.Count); // 2
}
}
To avoid fragmentation I'm consolidating multiple issues here that likely have the same cause and solution:
2103
2579
3429
4187
4340
4424
4783
Edit NC, added
5799