emu-russia / dmgcpu

DMG CPU Reverse Engineering
Creative Commons Zero v1.0 Universal
29 stars 4 forks source link

JR opcode #277

Closed ogamespec closed 5 months ago

ogamespec commented 5 months ago

Conditional jump (JR) is not working.

Test rom:

// Check conditional jump

// https://www.asm80.com/onepage/asmz80.html

//0000   00                     NOP   
//0001   37                     SCF   
//0002   38 03                  JR   c,label1
//0004   00                     NOP   
//0005   00                     NOP   
//0006   00                     NOP   
//0007                LABEL1:   
//0007   76                     HALT 

00
37
38 03
00
00
00

@0007
76

Waves:

image

ogamespec commented 5 months ago

It seems that the result of the ALU calculation is not saved on the ebus/fbus between cycles:

image

image

The picture above is not actually inverters, but a transparent DLatch for ebus/fbus buses

Rodrigodd commented 5 months ago

Unfortunately, didn't investigate this today as much as I expected, but the problem appears to be in the DataMux, it is not putting Res into DL.

Below, x[37] is s3_oe_alu_to_pbus, which control Res_to_DL in the DataMux. I believe that should put the result of Res in DL, which then goes to Z_in, to reg Z, and then to zbus to PCL. Similar thing should happen with W to PCH, but didn't check.

But I didn't investigate this thoroughly, could be wrong.

image

ogamespec commented 5 months ago

Yep, it's a tough nut to crack. I can't figure out the best way to approach it yet. Everything seems to be correct, but the signals aren't coming out.

Rodrigodd commented 5 months ago

Look a little more into it today. Appears to be a issue in how WR_hack is currently implemented.

From investigation.md:


JR d16

Jump relative is not jumping to the correct address. The jump relative should use the ALU to compute the target address, and them set the value of PC to its result. What appears to be happening is a bug in the DataMux, where it is not putting the value of Res into DL (internal databus).

The DataMux has currently the following logic (assuming CLK is high, and Test1 is low):

assign Ext_bus = (WR_hack && ~Test1 && clk) ? int_to_ext_q : 1'bz;
assign Int_bus = (~WR_hack && ~Test1 && clk) ? ext_to_int_q :
    ( Res_to_DL ? res_q : ( DataOut ? dv_q : 1'bz ) );
WR Res_to_DL DataOut Operation
1 1 x D <- DL; DL <- Res
1 0 0 D <- DL
1 0 1 D <- DL; DL <- DV
0 x x DL <- D

I think the first line is invalid, DL is being written and read at the same time, while I expect that another entity must be already driving DL. In no other case the content of Res is put into DL.

During the execution of the instruction JR d16, WR is low and Res_to_DL is high.

Rodrigodd commented 5 months ago

Investigate the DataMux logic more in depth, and I think I got it to work. One way is to replicate the exact logic of the circuit, and then make all internal connections have a smaller driving strength than the external bus to solve any conflict, and an even smaller signal pulling the buses up, to simulate the pre-charging:

assign Ext_bus = (clk || Test1) ? 1'bz : 1;
assign(pull0, pull1) Ext_bus = clk && ~(int_to_ext_q || Test1) ? 0 : 1'bz;

assign Int_bus = clk ? 1'bz : 1;
assign(pull0, pull1) Int_bus = clk && ~(Test1 || ext_to_int_q) ? 0 : 1'bz;
assign(pull0, pull1) Int_bus = (Res_to_DL && ~Res) ? 0 : 1'bz;

// DataBridge logic
assign(pull0, pull1) Int_bus = (~dv_q) ? (DataOut ? 0 : 1'bz) : 1'bz;

// Drive DL and D buses up with weak strength to simulate the precharge.
assign (weak0, weak1) Ext_bus = 1;
assign (weak0, weak1) Int_bus = 1;

One problem with that approach is that it does not work in Verilator (although I'm not sure if it is currently working in the first place), apart from making the logic hard to read.

Another approach is to introduce the same hack you did before with the WR_hack, but also with a RD_hack signal too:

assign Ext_bus = ~clk ? 1
        : RD_hack && ~WR_hack ? 1'bz
        : ~RD_hack && WR_hack ? int_to_ext_q
        : 1;

assign Int_bus = ~clk ? 1
        : RD_hack && ~WR_hack ? ext_to_int_q
        : ~RD_hack && WR_hack && DataOut ? dv_q
        : ~RD_hack && WR_hack ? 1'bz
        : Res_to_DL ? res_q
        : 1;

This one also works, but it introduces some undefined values into Z_in when Int_bus temporarily goes z, but does not affect the simulation (but didn't try it in longer simulations). But this could be fixed by simulating the precharge, as in the previous approach (which would also break the Verilator simulation).

@ogamespec What do you think is the better approach? I can make a PR with either of them.

With the fix above I was able to simulate the entirety of quickboot.bin 🎉! I believe the next step is to run blargg's cpu_instrs test rom.

Bellow are my notes in investigation.md:

## DataMux In `datamux.md`, the logisim of the `DataMux` describes the following Verilog: ```verilog assign Ext_bus = ~Test1 ? (clk ? int_to_ext_q : 1) : 1'bz; assign Int_bus = Res_to_DL ? res_q : ( clk ? (~Test1 ? ext_to_int_q : 1'bz) : 1); assign Int_bus = (~dv_q) ? (DataOut ? 0 : 1'bz) : 1'bz; ``` The logic above has a conflict when `DataOut=0 && dv_q=0` (driving `Int_bus` to 0), and `(Res_to_DL=1 & req_q=1) | (Res_to_DL=0 & clk=0) | (Res_to_DL=0 & clk=1 & Test1=0 & ext_to_int_q=1)` (driving `Int_bus` to `1`). Looking at the underlining transistors logic (from ogamespec diagram, I can't read the silicon traces), the logic is the following Verilog: ```verilog assign Ext_bus = (clk || Test1) ? 1'bz : 1; assign Ext_bus = int_to_ext_q || Test1 ? 1'bz: 0; assign Int_bus = clk ? 1'bz : 1; assign Int_bus = clk && ~(Test1 || ext_to_int_q) ? 0 : 1'bz; assign Int_bus = (Res_to_DL && ~Res) ? 0 : 1'bz; ``` The circuit has even more conflicts. We can represent it in the following tables: `Test1` | `clk` | `int_to_ext_q` | `=Ext_bus` --- | --- | --- | --- 0 | 0 | 0 | 1,0 = x 0 | 0 | 1 | 1,z = 1 0 | 1 | 0 | z,0 = 0 0 | 1 | 1 | z,z - z 1 | - | - | 1,z = 0 `Test1` | `clk` | `ext_to_int_q` | `Res_to_DL` | `Res` | `=Int_bus` --- | --- | --- | --- | --- | --- 0 | 0 | - | 0 | - | 1,z,z = 1 0 | 0 | - | 1 | 0 | 1,z,0 = - 0 | 0 | - | 1 | 1 | 1,z,z = 1 0 | 1 | 0 | 0 | - | z,0,z = 0 0 | 1 | 0 | 1 | 0 | z,0,0 = 0 0 | 1 | 0 | 1 | 1 | z,0,z = 0 0 | 1 | 1 | 0 | - | z,z,z = z 0 | 1 | 1 | 1 | 0 | z,z,0 = 0 0 | 1 | 1 | 1 | 1 | z,z,z = z 1 | 0 | - | 0 | - | 1,z,z = 1 1 | 0 | - | 1 | 0 | 1,z,0 = x 1 | 0 | - | 1 | 1 | 1,z,z = 1 1 | 1 | - | 0 | - | z,z,z = 0 1 | 1 | - | 1 | 0 | z,z,0 = 0 1 | 1 | - | 1 | 1 | z,z,z = 0 We assume that those conflicts are resolved by the non-digital nature of the silicon circuit. First, I assume that `clk=0` will always drive the buses to `1` as pre-charged. I also assume that `Test1` will always be low, because it is only used when testing the circuit, I believe. With these assumptions, the table can be simplified to: `clk` | `int_to_ext_q` | `=Ext_bus` --- | --- | --- 0 | - | 1 1 | 0 | 0 1 | 1 | z `clk` | `ext_to_int_q` | `Res_to_DL` | `Res` | `=Int_bus` --- | --- | --- | --- | --- 0 | - | - | - | 1,-,- = 1 1 | 0 | 0 | - | z,0,z = 0 1 | 0 | 1 | 0 | z,0,0 = 0 1 | 0 | 1 | 1 | z,0,z = 0 1 | 1 | 0 | - | z,z,z = z 1 | 1 | 1 | 0 | z,z,0 = 0 1 | 1 | 1 | 1 | z,z,z = z This solve the conflicts, but there is still `z`s in the table. Due to the pre-charge of the buses, I suppose that the `z` will be resolved to `1`. I forget to include the `DataBridge` logic. Here it is: ```verilog assign Int_bus = (~dv_q) ? (DataOut ? 0 : 1'bz) : 1'bz; ``` `DataOut` | `dv_q` | `=Int_bus` --- | --- | --- 0 | 0 | z 0 | 1 | z 1 | 0 | 0 1 | 1 | z This logic is parallel to the previous one, so it will also create a conflict if this outputs 0 and the other outputs 1. But I suppose that `DataOut` will not be 1 when clk is 0, so no conflict will arise. But a conflict will arise when another component of the system tries to drive the external bus or internal bus. For example, when `R` (read signal) is high, the external bus will be driven by the memory, but the internal bus still drives it low. Here is a table with possible scenarios: Situation | Circuit | Expected --- | --- | --- RD=0,WR=0 | D <- DL↓, DL <- D↓ | - RD=1,WR=0 | D driven, D <- DL↓, DL <- D↓ | DL <- D RD=0,WR=1 | DL driven, D <- DL↓, DL <- D↓ | D <- DL RD=0,WR=0,Res_to_DL=1 | D <- DL↓, DL <- D↓ | DL <- Res RD=0,WR=1,DataOut=1 | DV driven, D <- DL↓, DL <- D↓, DL <- DV↓ | D <- DL, DL <- DV * `↓` means `x ? z : 0` For the circuit to work as expected, it should need to know the values of RD and WR, but it does not, it always connect both buses together. One way this may be working, is that the external driver is always stronger than the internal connections. Not sure how this is suppose to work in the actual silicon. We can try to simulate this behavior using Verilog's strength levels: ```verilog assign Ext_bus = (clk || Test1) ? 1'bz : 1; assign(pull0, pull1) Ext_bus = clk && ~(int_to_ext_q || Test1) ? 0 : 1'bz; assign Int_bus = clk ? 1'bz : 1; assign(pull0, pull1) Int_bus = clk && ~(Test1 || ext_to_int_q) ? 0 : 1'bz; assign(pull0, pull1) Int_bus = (Res_to_DL && ~Res) ? 0 : 1'bz; // DataBridge logic assign(pull0, pull1) Int_bus = (~dv_q) ? (DataOut ? 0 : 1'bz) : 1'bz; // Drive DL and D buses up with weak strength to simulate the pre-charge. assign (weak0, weak1) Ext_bus = 1; assign (weak0, weak1) Int_bus = 1; ``` And this works!! But this breaks Verilator (it does not support strength levels in module ports). Another way to simulate this is to add hack `RD` and `WR` signals to `DataMux`, and only simulate the high level behavior: ```verilog assign Ext_bus = ~clk ? 1 : RD_hack && ~WR_hack ? 1'bz : ~RD_hack && WR_hack ? int_to_ext_q : 1; assign Int_bus = ~clk ? 1 : RD_hack && ~WR_hack ? ext_to_int_q : ~RD_hack && WR_hack && DataOut ? dv_q : ~RD_hack && WR_hack ? 1'bz : Res_to_DL ? res_q : 1; ``` This also works, although in this situation the pre-charge is not simulated, making `Int_bus` go to `z`, temporarily introducing a `x` into `Z_in` and `W_in`, but that does not affect the simulation.
ogamespec commented 5 months ago

Hi! I can't tell you how cool it is :) I think the RD hack is the better option. Even though it's a hack, it's at least more or less close to the synthesized version. The DataMux circuit itself is very tricky, so it's unlikely to be without some workarounds.

ogamespec commented 5 months ago

Fixed by @Rodrigodd