HaxeFoundation / haxe

Haxe - The Cross-Platform Toolkit
https://haxe.org
6.18k stars 655 forks source link

Heinsenbug #4347

Closed ncannasse closed 9 years ago

ncannasse commented 9 years ago

Just currently tracking an hardcore bug with our Evoland2 code base and latest Haxe compiler related to monomorphs. I don't really get what's going on so I'll log my findings here, hoping it rings some bells wrt to some recent change - it might actually not be related to a recent change at all given how hard it seems to reproduce.

The issue is the following: all my unserialized data gets turned to String values.

The reason: in haxe.Unserializer, we have the following code:

    function unserializeObject(o) {
        while( true ) {
            if( pos >= length )
                throw "Invalid object";
            if( get(pos) == "g".code )
                break;
            var k = unserialize();
            if( !Std.is(k,String) )
                throw "Invalid object key";
            var v = unserialize();
            Reflect.setField(o,k,v);
        }
        pos++;
    }

In that case v is a monomorph, so it should be generated as Dynamic by genSWF, and it was actually generated that way... up to now.

In the SWF output, I'm getting the String type instead of Dynamic. More funny is that if I $type(v) before and after the setField, it correctly prints Unknown but will anyway compiles to String.

Of course, trying to reproduce with a small example does not actually reproduces the bug (how funny is that !).

Any insight ?

Simn commented 9 years ago

What does -D dump say about v's type?

ncannasse commented 9 years ago

haven't checked but it most likely says String. I found the only place where the TMono is actually bound, here's the stack trace:

Raised at file "type.ml", line 1239, characters 34-38
Called from file "type.ml", line 1370, characters 41-53
Called from file "type.ml", line 1448, characters 2-22
Called from file "codegen.ml", line 1806, characters 10-31
Called from file "codegen.ml", line 1813, characters 13-26
Called from file "list.ml", line 57, characters 20-23
Called from file "type.ml", line 2010, characters 26-41
Called from file "codegen.ml", line 1864, characters 10-36
Called from file "list.ml", line 57, characters 20-23
Called from file "list.ml", line 57, characters 32-39
Called from file "list.ml", line 57, characters 32-39
Called from file "list.ml", line 57, characters 32-39
Called from file "list.ml", line 57, characters 32-39
Called from file "list.ml", line 57, characters 32-39
Called from file "list.ml", line 57, characters 32-39
Called from file "type.ml", line 2010, characters 26-41
Called from file "codegen.ml", line 1864, characters 10-36
Called from file "type.ml", line 1994, characters 30-34
Called from file "codegen.ml", line 1864, characters 10-36
Called from file "list.ml", line 57, characters 20-23
Called from file "type.ml", line 2010, characters 26-41
Called from file "codegen.ml", line 1864, characters 10-36
Called from file "codegen.ml", line 1860, characters 59-75
Called from file "list.ml", line 86, characters 24-34
Called from file "filters.ml", line 1107, characters 22-29
Called from file "list.ml", line 75, characters 12-15
Called from file "filters.ml", line 1112, characters 2-45
Called from file "list.ml", line 75, characters 12-15
Called from file "filters.ml", line 1209, characters 2-59
Called from file "main.ml", line 1508, characters 2-27
Called from file "main.ml", line 637, characters 3-11
Called from file "main.ml", line 1741, characters 1-35
sledorze commented 9 years ago

bisect?

ncannasse commented 9 years ago

For the record that was funny to track : I used $type(v) to be able to store my TMono reference into a variable, then raise an error when we actually try to bind it in Type.link

Simn commented 9 years ago

type.ml line 1370 is (match !t with on development branch so the line numbers are likely not accurate.

ncannasse commented 9 years ago

@Simn yes that's because I added a few lines to type.ml to get my stack. The error actually lie in codegen which should not unify any TMono I guess ?

Simn commented 9 years ago

I'm not sure... I still don't really see where that String unification comes from. The stacktrace suggests that the unification comes from an assignment (TBinop, not TVar).

ncannasse commented 9 years ago

No much luck with bisect, I'm getting unrelated errors while bisecting, it seems there were several times the development version was broken wrt compiling our big code base :)

Simn commented 9 years ago

You could try commenting out https://github.com/HaxeFoundation/haxe/blob/development/filters.ml#L1186, then compile with -D dump and grep for the Unknown<X> type that v should have. That should tell you where else the type escaped to.

ncannasse commented 9 years ago

@Simn it uses type_iseq which uses type_eq, which might trigger some monomorph linking

Simn commented 9 years ago

I know that, but I don't see why the type of your v should end up in any place where that would matter.

ncannasse commented 9 years ago

that's another question :) but I'm quite sure it does given the stack trace. Confirming with dump ATM

ncannasse commented 9 years ago

Uhm, seems liek commenting out the filter still trigger the bug, will look into it now

    unserializeObject(method) : o : { } -> Void

     = [Function:o : { } -> Void]
        [Arg:{ }] [Local o(14162):{ }]
        [Block:Int]
            [While:Void]
                [Parenthesis:Bool] [Const true:Bool]
                [Block:Void]
                    [If:Void]
                        [Parenthesis:Bool]
                            [Binop:Bool]
                                [Field:Int]
                                    [Const this:haxe.Unserializer]
                                    [FInstance:Int]
                                        haxe.Unserializer
                                        pos
                                >=
                                [Field:Int]
                                    [Const this:haxe.Unserializer]
                                    [FInstance:Int]
                                        haxe.Unserializer
                                        length
                        [Then:Unknown<89>] [Throw:Unknown<89>] [Const "Invalid object":String]
                    [If:Void]
                        [Parenthesis:Bool]
                            [Binop:Bool]
                                [Call:Int]
                                    [Field:Int -> Unknown<88>]
                                        [Field:String]
                                            [Const this:haxe.Unserializer]
                                            [FInstance:String]
                                                haxe.Unserializer
                                                buf
                                        [FDynamic:Int -> Unknown<88>] cca
                                    [Field:Int]
                                        [Const this:haxe.Unserializer]
                                        [FInstance:Int]
                                            haxe.Unserializer
                                            pos
                                ==
                                [Const 103:Int]
                        [Then:Dynamic] [Break:Dynamic]
                    [Var k(14166):String]
                        [Call:Dynamic]
                            [Field:Void -> Dynamic]
                                [Const this:haxe.Unserializer]
                                [FInstance:Void -> Dynamic]
                                    haxe.Unserializer
                                    unserialize
                    [If:Void]
                        [Parenthesis:Bool]
                            [Unop:Bool]
                                !
                                Prefix
                                [Call:Bool]
                                    [Field:v : Dynamic -> t : Dynamic -> Bool]
                                        [TypeExpr Std:Class<Std>]
                                        [FStatic:v : Dynamic -> t : Dynamic -> Bool]
                                            Std
                                            is
                                    [Local k(14166):String]
                                    [TypeExpr String:Class<String>]
                        [Then:Unknown<90>] [Throw:Unknown<90>] [Const "Invalid object key":String]
                    [Var v(14173):String]
                        [Call:Dynamic]
                            [Field:Void -> Dynamic]
                                [Const this:haxe.Unserializer]
                                [FInstance:Void -> Dynamic]
                                    haxe.Unserializer
                                    unserialize
                    [Binop:String]
                        [Array:String]
                            [Local o(14162):{ }]
                            [Local k(14166):String]
                        =
                        [Local v(14173):String]
            [Unop:Int]
                ++
                Postfix
                [Field:Int]
                    [Const this:haxe.Unserializer]
                    [FInstance:Int]
                        haxe.Unserializer
                        pos
ncannasse commented 9 years ago

Uhm, retrying, I only commented out the filter when analyzer was run

ncannasse commented 9 years ago

It indeed came from the filter (here's the dump with filter commented out)

    unserializeObject(method) : o : { } -> Void

     = [Function:o : { } -> Void]
        [Arg:{ }] [Local o(14162):{ }]
        [Block:Int]
            [While:Void]
                [Parenthesis:Bool] [Const true:Bool]
                [Block:Void]
                    [If:Void]
                        [Parenthesis:Bool]
                            [Binop:Bool]
                                [Field:Int]
                                    [Const this:haxe.Unserializer]
                                    [FInstance:Int]
                                        haxe.Unserializer
                                        pos
                                >=
                                [Field:Int]
                                    [Const this:haxe.Unserializer]
                                    [FInstance:Int]
                                        haxe.Unserializer
                                        length
                        [Then:Unknown<90>] [Throw:Unknown<90>] [Const "Invalid object":String]
                    [If:Void]
                        [Parenthesis:Bool]
                            [Binop:Bool]
                                [Call:Int]
                                    [Field:Int -> Unknown<89>]
                                        [Field:String]
                                            [Const this:haxe.Unserializer]
                                            [FInstance:String]
                                                haxe.Unserializer
                                                buf
                                        [FDynamic:Int -> Unknown<89>] cca
                                    [Field:Int]
                                        [Const this:haxe.Unserializer]
                                        [FInstance:Int]
                                            haxe.Unserializer
                                            pos
                                ==
                                [Const 103:Int]
                        [Then:Dynamic] [Break:Dynamic]
                    [Var k(14166):String]
                        [Call:Dynamic]
                            [Field:Void -> Dynamic]
                                [Const this:haxe.Unserializer]
                                [FInstance:Void -> Dynamic]
                                    haxe.Unserializer
                                    unserialize
                    [If:Void]
                        [Parenthesis:Bool]
                            [Unop:Bool]
                                !
                                Prefix
                                [Call:Bool]
                                    [Field:v : Dynamic -> t : Dynamic -> Bool]
                                        [TypeExpr Std:Class<Std>]
                                        [FStatic:v : Dynamic -> t : Dynamic -> Bool]
                                            Std
                                            is
                                    [Local k(14166):String]
                                    [TypeExpr String:Class<String>]
                        [Then:Unknown<91>] [Throw:Unknown<91>] [Const "Invalid object key":String]
                    [Var v(14173):Unknown<92>]
                        [Call:Dynamic]
                            [Field:Void -> Dynamic]
                                [Const this:haxe.Unserializer]
                                [FInstance:Void -> Dynamic]
                                    haxe.Unserializer
                                    unserialize
                    [Local v(14173):Unknown<92>]
                    [Binop:Unknown<55>]
                        [Array:Unknown<55>]
                            [Local o(14162):{ }]
                            [Local k(14166):String]
                        =
                        [Local v(14173):Unknown<92>]
            [Unop:Int]
                ++
                Postfix
                [Field:Int]
                    [Const this:haxe.Unserializer]
                    [FInstance:Int]
                        haxe.Unserializer
                        pos
Simn commented 9 years ago

Do you have Unknown<92> or Unknown<55> anywhere else in the dump files?

ncannasse commented 9 years ago

I'll make a full compare of the dump files with and without the filter to see what's affected

ncannasse commented 9 years ago

Here's the full (working) Unserializer http://pastebin.com/YaZ7z8QC

ncannasse commented 9 years ago

No Unknown<92> apart from these, but I'm getting a lot of Unknown<55> at all the places Reflect.field is inlined actually

Simn commented 9 years ago

You're right, this does the trick:

class Main {
    static function main() {
        var v = getDynamic();
        var o = {};
        Reflect.setField(o, "f", v);
        var s:String = "";
        Reflect.setField(o, "f", s);
    }

    static function getDynamic():Dynamic {
        return 1;
    }
}
ncannasse commented 9 years ago

heinseinbug spotted \o/

Simn commented 9 years ago

We can disable the filter, it's not really used at the moment anyway. Still, inline-monomorphs like this could be bad in other places as well. We make copies of the tvars but we don't copy any TMonos that are in a texpr.etype.

ncannasse commented 9 years ago

As an user you usually have to be careful about monomorphs in either parameters or return types, the monorphs in the function body should not be modified. I agree it's dangerous but I don't see any solution apart from us being careful not unifying these, since they are at the base of type inference.

ncannasse commented 9 years ago

We could copy them when inlining but that would eventually make the unification issues even harder to spot

Simn commented 9 years ago

We could add a filter that walks the AST and binds all monomorphs to Dynamic. I think that would be compilation-server-safe too because if we still have a monomorph it was never unified with anything other than Dynamic, so there's no reason why it should be unified during the next iteration, right?

ncannasse commented 9 years ago

@Simn that would work as a first filter, we still need to be careful not to have such bindings occurring during the typing phase

ncannasse commented 9 years ago

@Simn actually, now you mention it... this might create issues with compilation server

ncannasse commented 9 years ago

if you cache a module with unbound monomorph parameter, you don't want it to turn into Dynamic or else all usages of it will silently fail, that would introduce some difference between using compilation server or not

ncannasse commented 9 years ago

I've disabled the filter for now. Tell me if there's more of these unused filters, I'm sure we can speedup the compiler this way :)

Simn commented 9 years ago

It was active until very recently because it dealt with the cs.Ref and cs.Out types, which require special treatment in the analyzer due to their semantics.

By the way, Java/C# do quite a bit of unification in gencommon.ml as well. I'm sure the problem could be reproduced there if someone tried. @waneck

waneck commented 9 years ago

I always map unfollowed TMono's to Dynamic before doing unification, IIRC

waneck commented 9 years ago

And BTW, I agree that TMonos should be bound to TDynamic. But this is mostly related with Haxe spilling out unbound TMonos outside a function context (e.g. returning a TMono). IMHO, this should be bound to TDynamic as early as the function has finished typing

Simn commented 9 years ago

I would love to have a test for this but that's not going to be trivial. I guess I have to write a parser for our AST dump (and then write an interpreter for it to replace neko (and cppia)). :P

ncannasse commented 9 years ago

@waneck that was what I was thinking at first when designing Haxe, but then suddenly a lot of code turned Dynamic just because of some missing typing.

For instance let's say you have function getFirstUser( users ) return users[0];

You definitely don't want it to be typed as Array<Dynamic>

Simn commented 9 years ago

I agree, but us common people prefer type-hinting our function arguments anyway. ;)

ncannasse commented 9 years ago

@Simn Thou shalt pray to type inference god