hlorenzi / customasm

💻 An assembler for custom, user-defined instruction sets! https://hlorenzi.github.io/customasm/web/
Apache License 2.0
719 stars 56 forks source link

Error "Instruction size did not converge after iterations" [v0.11.1] #47

Closed ProxyPlayerHD closed 3 years ago

ProxyPlayerHD commented 4 years ago

sadly the new update isn't perfect... (the code assembles perfectly with v0.10.6) here the complete output:

error: instruction size did not converge after iterations
 --> Testing2.R816:12:3:
 10 |
 11 |   .LOOP:
 12 |     LD R0, (X)
    |     ^^^^^^^^^^
 13 |     LD (SCRN_START, X), R0
 14 |     INC XL

here the CPU file: https://pastebin.com/RUUadTQK and the code i tried to assemble: https://pastebin.com/uakEHJjY

pol-rivero commented 4 years ago

This is really weird, changing this line LD {Re:REGS}, ({ADDR}) => 0b0000 @ Re[3:0] @ 0b0000 @ 1[3:0] @ ADDR[15:0] to LD {Re:REGS}, ({ADDR}) => { assert(true), 0b0000 @ Re[3:0] @ 0b0000 @ 1[3:0] @ ADDR[15:0] } on your ruledef seems to fix the issue.

Edit: I was wrong, the error is no longer present but the output isn't correct. Changing the order of the lines (putting LD {Re:REGS}, (X) above LD {Re:REGS}, ({Addr})) seems to work in this case.

ProxyPlayerHD commented 4 years ago

thanks for the help. it's still weird that it's happening at all...

moonheart08 commented 4 years ago

I feel like this might have something to do with the ()

ProxyPlayerHD commented 4 years ago

but why? as said it works perfectly fine in previous versions.

moonheart08 commented 4 years ago

Ok, correction, I see why. {Addr} can match labels as well, and you could name a label X. So (X) can conflict. Odd error though

ProxyPlayerHD commented 4 years ago

as far as i understand it though the assembler should prioritize direct matches over labels.

it should be impossible to use X as a label in LD R0, (X) for example. (X) should automatically match that instruction that has (X) and not the ({addr}) one.

moonheart08 commented 4 years ago

The assembler prioritizes in whatever order the matches are specified. If you specify (X) first, it'll prioritize it.

hlorenzi commented 4 years ago

Thanks for reporting!! I've come up with a minimal test case:

#ruledef
{
    LD {ADDR} => 0x0 @ ADDR`16
    LD X      => 0x1
}

LD X

The error is gone if you either (1) swap the declarations around, or (2) make both declarations output the same number of bits. Since (2) is also possible, I think there might be a bug in the code, or at least some inconsistency I did not think about. Also, the error message is misleading... I'll take a look at it soon.

But usually, I think it should be a good rule of thumb to declare your fixed patterns before the variable patterns, like moonheart08 said. I'll remember to make this fact clearer in the docs!

ProxyPlayerHD commented 4 years ago

alright i have to keep that in mind. the only reason the ({ADDR}) instruction comes before the (X) Instruction is because that is how it's done in the actual hardware, look at the upper 4 bits of both instructions to see what i mean...

ProxyPlayerHD commented 3 years ago

can i piggyback another "Instruction Matching" issue on this one? This is basically the same as the one we already had, but just more confusing.

i got this small piece of code that has 2 different instructions:

LD (0xBEEF), R0
LD (0x6969, R1), R0

and this is the snipet of the CPU file that handles both of them:

LD ({ADDR}), {Ra:REGS}          => 0b1000 @ Ra`4 @ 0b0000 @ 1`4 @ ADDR`16
LD ({ADDR}, {Rb:REGS}), {Ra:REGS}   => 0b1011 @ Ra`4 @ Rb`4 @ 1`4 @ ADDR`16

and it should assemble into this: 0x8001 0xBEEF 0xB011 0x6969 but the actual output is this: 0x8001 0xBEEF 0x8001 0x6969

meaning that for some reason the Assembler completely ignores the , R1 in the second instruction and treats it exactly like the first one (just with a differnt address). switching both instruction definitions fixes the issue, but it still confuses me because why would the Assembler just ignore a comma when matching instructions? i specifically use commas for offset instructions because i know it would ignore arithmetic symbols like +, -, etc.

I also have this:

LD ({Rb:REGS}, {ADDR}), {Ra:REGS}           => 0b1011 @ Ra`4 @ Rb`4 @ 1`4 @ ADDR`16

which is an alternative way of writing the second instruction shown, where the Register is written first and then the Address... and it somehow works perfectly fine without having to move around definitions.

LD (0x6969, R1), R0     ; Doesn't work without changes
LD (R1, 0x6969), R0     ; Works
hlorenzi commented 3 years ago

Alriiiight, I think I finally got this sorted out! Both the original issue and your latest one. Could you check if everything now works as intended?

I made instruction matching not prioritize declaration order anymore, so we can rule that off if new issues arise.

About your latest issue, the assembler was really just plainly ignoring the , R1 part. It was definitely a parsing bug!

Also, you mentioned refraining from using + and - after instruction parameters, but I've made parsing aware of this since v0.11! Please try to use them when you have the chance!

ProxyPlayerHD commented 3 years ago

it seems to work now, nice work man!

also i'm scared of using + and - in the instructions because it might become confusing when you already use them for the Address part, or if the original CPU (like the 6502) uses commas in the first place