dschmenk / PLASMA

Proto Language AsSeMbler for All (formerly Apple)
MIT License
189 stars 26 forks source link

*@a and *(@a) give different results for byte-sized variables #49

Closed leanto closed 1 year ago

leanto commented 5 years ago

I was just experimenting with different addressing and I discovered that if a is declared as a byte, *@a and *(@a) give different results:

include "inc/cmdsys.plh"

byte a
byte b

a = 5
b = 6
puti(*@a) // 5
putln()
puti(*(@a)) // 1541
putln()

Specifically, it appears that *@a remembers that a is a byte while *(@a) does not.

Is this intended behavior?

leanto commented 5 years ago

Following up on this, I tried the same test with *@x when x is declared as a word:

include "inc/cmdsys.plh"

word x

x = $5566
puti(^@x) // 85
putln()
puti(^(@x)) // 85
putln()
puti(*@x) // 21845

Here, ^@x does not "remember" that x is a word, so I think the behavior with byte values is a bug.

dschmenk commented 5 years ago

Yep, it looks like types don't get tracked properly through the expression. The '*' operator should have precedence and return a word. I'll into where the '@' kept the byte type when it shouldn't have.

Thanks for the great sample code, too!

Dave...

iflan commented 5 years ago

BTW, I tested this in AppleWin and it's in the PLASMA version of the compiler, too:

image

iflan commented 5 years ago

For the fun of it, I tried to figure out what the problem is. Here's what I managed to figure out:

After puti gets parsed as a function call, parse_list gets invoked to parse all of the arguments. This in turn calls parse_expr which calls parse_value. parse_value is where things get interesting.

The first time through the while (scan()) {...} loop, scantoken == WPTR_TOKEN, causing the if (scantoken == BPTR_TOKEN || scantoken == WPTR_TOKEN) block to run. Here type = WPTR_TYPE and deref = 2. Then the loop starts again.

The second time through the while (scan()) {...} loop, scantoken == AT_TOKEN and deref = 1.

The third time through the while (scan()) {...} loop, scantoken == ID_TOKEN. In that block, tokenstr = "a) // 5" and tokenlen = 1, meaning that the token is "a", as expected. The result of id_type(tokenstr, tokenlen) = BYTE_TYPE, giving type = 260.

This causes flow to fall into the else block where value = 27 = CONST_TYPE | WORD_TYPE | ASM_TYPE | DEF_TYPE. This causes valseq to be set a t_opseq that will load the global address of a. This is fine.

The fourth time through the while (scan()) {...} loop, scantoken == CLOSE_PAREN_TOKEN, causing the loop to exit. The while (deref > rvalue) {...} loop is skipped. Then flow enters if (deref) {...}.

At this point we have this block wherein the problem lies:

        if (type & FUNC_TYPE)
        { ... }
        else if (type & (BYTE_TYPE | BPTR_TYPE))
            valseq = gen_lb(valseq);
        else if (type & (WORD_TYPE | WPTR_TYPE))
            valseq = gen_lw(valseq);
        else
            parse_error("What are we dereferencing?");

Because type = BYTE_TYPE | WPTR_TYPE, this causes the LB instruction to be generated instead of LW. Were the if clauses inverted, it would generate LW correctly, but ^@word_var would also generate LW, which is also wrong.

Perhaps the interplay of the WORD_TYPE and BYTE_TYPE with WPTR_TYPE and BPTR_TYPE can be fixed here. If I understand correctly, the WORD_TYPE and BYTE_TYPE can only come from the symbol that's being operated on, but the WPTR_TYPE and BPTR_TYPE are the type of dereference that's actually happening. So maybe something like:

        else if (type & (BPTR_TYPE))
            valseq = gen_lb(valseq);
        else if (type & (WPTR_TYPE))
            valseq = gen_lw(valseq);
        else if (type & (BYTE_TYPE)
            valseq = gen_lb(valseq);
        else if (type & (WORD_TYPE))
            valseq = gen_lw(valseq);

would work because it gives preference to the pointer type.

However, maybe this is the wrong approach because, in actually tracing through, it seems like *@a == @*a which seems even more wrong. And this test shows that it's so:

include "inc/cmdsys.plh"

byte a
byte b

a = 5
b = 6
puti(*@a) // 5
putln()
puti(*(@a)) // 1541
putln()
puti(@*a) // 5
putln()
done
dschmenk commented 5 years ago

Perhaps the interplay of the WORD_TYPE and BYTE_TYPE with WPTR_TYPE and BPTR_TYPE can be fixed here. If I understand correctly, the WORD_TYPE and BYTE_TYPE can only come from the symbol that's being operated on, but the WPTR_TYPE and BPTR_TYPE are the type of dereference that's actually happening. So maybe something like:

    else if (type & (BPTR_TYPE))
        valseq = gen_lb(valseq);
    else if (type & (WPTR_TYPE))
        valseq = gen_lw(valseq);
    else if (type & (BYTE_TYPE)
        valseq = gen_lb(valseq);
    else if (type & (WORD_TYPE))
        valseq = gen_lw(valseq);

would work because it gives preference to the pointer type.

This is probably the easiest approach without rewriting the prefix operator handling. I've gone back and forth on this over the years and keep finding situations where it breaks down.

However, maybe this is the wrong approach because, in actually tracing through, it seems like @a == @a which seems even more wrong. And this test shows that it's so:

include "inc/cmdsys.plh"

byte a
byte b

a = 5
b = 6
puti(*@a) // 5
putln()
puti(*(@a)) // 1541
putln()
puti(@*a) // 5
putln()
done

Note that @*a is non-sensical and should throw an error (but currently doesn't). You can only take the address of a variable/function, not an expression or intermediate value.

iflan commented 5 years ago

@dschmenk, I think this can be closed. (Unless you want to use this for fixing the prefix operator parsing, that is.)

dschmenk commented 5 years ago

I should at least add warnings/errors for non-sensical order of pre-ops.