ZipCPU / arrowzip

A ZipCPU based demonstration of the MAX1000 FPGA board

output of memtest #6

Closed NeuerUser closed 5 years ago

NeuerUser commented 5 years ago

If I read the memtest program correctly, it should turn on 5 of the 8 LEDs, right? Well, after starting, the program just turns on all 8 LEDs. There is no output on telnet or netuart, which is to be expected, right?

$ time ./zipload -v -r ../board/memtest
ZipLoad: Verbose mode on
Halting the CPU
Loading: ../board/memtest
Writing to MEM: 00c00000-00c0039c
Clearing the CPUs registers
Setting PC to 00c00000
Starting the CPU
CPU Status is: 0000200f

real    0m4.846s
user    0m1.380s
sys     0m3.431s

Loading is pretty fast.

ZipCPU commented 5 years ago

If all goes well in the memtest, the LEDs should go as follows:

  1. *_spio = 0x0f00; // Start out by turning off the bottom four LEDs

    The design then writes to all memory values, sequentially, as though they were integers. It then comes back and reads all of these integers. Once complete, it sets the LEDs and moves to the next test.

  2. *_spio = 0x0f01; // Turn on the bottom LED

    This is to indicate that the first test passed.

    The next test is a repeat of the first test, save that triplets are read/written to memory at the same time. This is designed to test memory pipelining, something that requires both the CPU pipeline (which isn't turned on) and pipelined memory access. Hence, this test should succeed or fail the same as the first.

  3. *_spio = 0x0f03; // Turn on the bottom two LEDs

    We now go through the memory three 8-bit characters at a time, setting the memory to random values and then checking them three characters at a time (rather than as 32-bit integers).

  4. *_spio = 0x0f07; // Set the bottom three LEDs

    The next test is a random access test, randomly writing pseudorandom data through memory

  5. At the end of this sequence, we toggle the fifth bit and repeat.
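The LED sequence above can be summarised in a small C sketch. It assumes, based only on the comments quoted in the list, that the upper byte of the word written to `*_spio` selects which LEDs are affected and the lower byte gives their new state. The helper names `led_word` and `progress_word` are hypothetical and are not taken from memtest.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build an LED word from a mask and a value.
 * ASSUMPTION: bits 15:8 select which LEDs change, bits 7:0 give their state. */
uint32_t led_word(uint8_t mask, uint8_t value)
{
    return ((uint32_t)mask << 8) | value;
}

/* Word written after `passed` of the first tests have succeeded (0..3):
 * the bottom `passed` LEDs are lit, the rest of the low four are off. */
uint32_t progress_word(unsigned passed)
{
    assert(passed <= 4);
    uint8_t lit = (uint8_t)((1u << passed) - 1u);   /* 0x0, 0x1, 0x3, 0x7 */
    return led_word(0x0f, lit);
}
```

Under that assumption, `progress_word(0)` through `progress_word(3)` reproduce the four values in the list: 0x0f00, 0x0f01, 0x0f03 and 0x0f07.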

If at any time in this process the SDRAM fails to return an expected value, all LEDs are turned on and the CPU halts. You can tell that the CPU has halted by running "zipstate". You can find out why/how it halted, and what the problem was, by first running "make memtest.txt" in sw/board to get a disassembled copy of the memtest program. Second, you'll want to run zipdbg to get the state of the processor when it failed. The user-space registers will indicate where in the process the program failed.

My guess is that you are struggling with the PLL generation. The PLL, known as genpll, needs to create two clocks: one at 80MHz and a second one offset from it by 90 degrees.
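For a sense of scale, a 90 degree offset at 80MHz is a quarter of the clock period. The sketch below only does that arithmetic; `phase_delay_ns` is a hypothetical helper and says nothing about how genpll itself is configured.

```c
#include <assert.h>

/* Time offset, in nanoseconds, equivalent to a phase offset at a given
 * clock frequency. Pure arithmetic; not taken from the genpll sources. */
double phase_delay_ns(double freq_hz, double phase_deg)
{
    assert(freq_hz > 0.0);
    double period_ns = 1e9 / freq_hz;       /* 80 MHz gives 12.5 ns */
    return period_ns * phase_deg / 360.0;   /* 90 degrees is a quarter period */
}
```

With `freq_hz = 80e6` and `phase_deg = 90.0`, the two clock edges are 3.125 ns apart.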

NeuerUser commented 5 years ago

Hi Dan

Now it gets interesting :) Unfortunately, I guess I don't have the knowledge as a beginner here. This is what I got:

$ time ./zipload -v -r ../board/memtest
ZipLoad: Verbose mode on
Halting the CPU
Loading: ../board/memtest
Writing to MEM: 00c00000-00c0039c
Clearing the CPUs registers
Setting PC to 00c00000
Starting the CPU
CPU Status is: 0000200f

real    0m4.848s
user    0m1.443s
sys     0m3.370s
$ ./zipstate 
0x0000100f: BUSY SW-HALT
$ ./zipdbg
Peripherals                   CPU State: 0x00001613 Halted
>PIC > 0x00c00030<   WDT : 0x00400000    WBUS: 0x88000004    PIC2: 0x00000000
 TMRA: 0x00000000    TMRB: 0x00000000    TMRC: 0x00000000    JIF : 0x00000000
 MTSK: 0x00000000    MOST: 0x00000000    MPST: 0x00000000    MICT: 0x00000000

Supervisor Registers
 sR0 : 0x00c00030    sR1 : 0x00400000    sR2 : 0x88000004    sR3 : 0x00000000 
 sR4 : 0x00000000    sR5 : 0x00000000    sR6 : 0x00000000    sR7 : 0x00000000 
 sR8 : 0x00000000    sR9 : 0x00000000    sR10: 0x00000000    sR11: 0x00000000 
 sR12: 0x00000000    sSP : 0x00c07800    sCC :TPHLT          sPC : 0x00c00394 
User Registers
 uR0 : 0x01800000    uR1 : 0x000485b5    uR2 : 0x02000000    uR3 : 0x000485b5
 uR4 : 0x8e648e64    uR5 : 0xffe00000    uR6 : 0x00000001    uR7 : 0x01800000
 uR8 : 0x00800010    uR9 : 0x00000000    uR10: 0x00000f00    uR11: 0x00000000
 uR12: 0x00000000    uSP : 0x00c07fd8    uCC :TP             uPC : 0x00c000c8

                                                     >00c07800 0x00000000
 0x00c0039c 0x00000000  SUB        $0,R0              00c07804 0x00000000
 0x00c00398 0x7b400000  RTN                           00c07808 0x00000000
>0x00c00394 0x68800800  ADD        $2048,SP           00c0780c 0x00000000
 0x00c00390 0x70c00010  HALT                          00c07810 0x00000000

The strange thing is that 0x00c00038 is not (at least to me) a reasonable PC, as I cannot see any HALT (or similar) instruction there:

memtest:     file format elf32-zip

Disassembly of section .ramcode:

00c00000 <_start>:
  c00000:       86 00 8e 00     CLR        R0             | CLR        R1
  c00004:       96 00 9e 00     CLR        R2             | CLR        R3
  c00008:       a6 00 ae 00     CLR        R4             | CLR        R5
  c0000c:       b6 00 be 00     CLR        R6             | CLR        R7
  c00010:       c6 00 ce 00     CLR        R8             | CLR        R9
  c00014:       d6 00 de 00     CLR        R10            | CLR        R11
  c00018:       66 00 00 00     CLR        R12
  c0001c:       6a 01 03 00     LDI        0x00c08000,SP  // c08000 <_top_of_stack>
  c00020:       6a 40 80 00 
  c00024:       76 00 00 00     TRAP
  c00028:       87 fa fc f8     JSR        0x00c002d4     // c002d4 <entry>
  c0002c:       00 c0 02 d4 

00c00030 <busy_failure>:
  c00030:       78 83 ff fc     BUSY

00c00034 <runtest>:
  c00034:       e8 24 85 00     SUB        $36,SP         | SW         R0,(SP)
  c00038:       ad 04 b5 08     SW         R5,$4(SP)      | SW         R6,$8(SP)
  c0003c:       bd 0c c5 10     SW         R7,$12(SP)     | SW         R8,$16(SP)
  c00040:       cd 14 d5 18     SW         R9,$20(SP)     | SW         R10,$24(SP)
  c00044:       dd 1c e5 20     SW         R11,$28(SP)    | SW         R12,$32(SP)
  c00048:       42 00 01 00     LDI        0x00800010,R8  // 800010 <_kram+0x800010>
  c0004c:       42 40 00 10 
  c00050:       06 00 ff 00     LDI        $65280,R0
  c00054:       04 c6 00 00     SW         R0,(R8)
  c00058:       02 00 02 00     BREV       $512,R0
  c0005c:       8e 04 8d 80     LDI        $4,R1          | SW         R1,(R0)

I did the test two times. In between I did a "make clean", "make" and "make memtest.txt", to ensure that everything has been rebuilt.

Looking at Quartus, I cannot see anything in the verilog concerning the pll that differs from your repo. (I only added files for the altera_lite_gpio.) Also ......

SORRY, JUST FOUND THE PROBLEM!!!! (Completely stupid!!!) I did not update my qsf file after you enabled the SDRAM, so there are NO PINS defined for the connection to the SDRAM. (So stupid of me!!!!) I will just do that now and then retest it.

NeuerUser commented 5 years ago

So, that HELPED! Now the memtest is working. :)

The third LED is not visible, however. It is possible that it is turned on and then turned off again directly afterwards. The fifth one is toggling slowly.

ZipCPU commented 5 years ago

Yeah, the 3rd LED gets turned off so fast that you don't notice it much. You will notice that all the LEDs don't turn on, and that the 5th LED toggles (slowly)--indicating success.

But let's talk a moment about that 0x0c00038 address you were looking at. I'm not sure where you found it. If you look left of the HALT, you'll see that the CPU halted at instruction 0xc00390. The SPC (Supervisor program counter) points to the next instruction, 0x0c00394. Most of the program runs in user mode, after the supervisor issues the RTU command (zip_rtu() in the code). That's where the test is run. Upon any detection of a failure, the CPU issues a "TRAP" instruction to switch to supervisor mode. Hence, the registers you really want to be looking at are the u* registers.

If you look at the uPC register, you'll see that the program stopped at 0xc000c8. This is one past the TRAP instruction which sent us back to supervisor space. (Yes, the *PC points to the next instruction, rather than the one just executed.)
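The "one past" rule can be written down as a one-line helper. This is a sketch that assumes each instruction slot is one 4-byte word, as the address steps in the disassembly listing suggest; `insn_before` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: one instruction word is 4 bytes, matching the 4-byte
 * address steps visible in the memtest disassembly. */
#define INSN_BYTES 4u

/* Address of the instruction just before the one a saved PC points at. */
uint32_t insn_before(uint32_t pc)
{
    assert(pc >= INSN_BYTES);
    return pc - INSN_BYTES;
}
```

Applied to the zipdbg output, `insn_before(0x00c00394)` gives 0x00c00390, the HALT, and `insn_before(0x00c000c8)` gives 0x00c000c4, where the TRAP would sit.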

If you back up a bit further, you'll see a BZ instruction (branch on zero). This follows the test that the value read from memory equals the value written. If they are equal, the CPU will jump to 0x0c000c8 and skip the TRAP instruction.

Backing up two more, you'll see where the CPU reads from memory (LW (R0), R4, or load a word from address R0 into register R4), followed by a compare (CMP R3,R4) comparing registers R3 with R4.

This means you can look at uR0, uR3, and uR4 (remember, this part is in user mode) to see what's going on. uR0 = 0x01800000, which is the very first address in RAM. This means we failed on the very first check, much like what you discovered with a bad pinout. If you look at uR4, you get 0x8e648e64, the value we just read from memory. Now, remember how the memory is a 16-bit memory, but the bus is 32 bits wide? That means that to read from the memory we needed to do two separate reads and piece them together. This value shows that both of those separate reads returned the same value. This is again consistent with a bad pinout, although it could have been other things. What was the value supposed to be? That's in R3, and it is set at 0x0485b5.
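To make the "two reads pieced together" point concrete, here is a minimal C sketch. The function names are hypothetical, and the choice of which 16-bit read lands in the upper half is an assumption; the real SDRAM controller may order them differently.

```c
#include <assert.h>
#include <stdint.h>

/* Join two 16-bit memory reads into one 32-bit bus word.
 * ASSUMPTION: the first read supplies the upper half of the word. */
uint32_t join_halves(uint16_t first, uint16_t second)
{
    return ((uint32_t)first << 16) | second;
}

/* Nonzero when both 16-bit halves of a word are identical, which is
 * the symptom seen in uR4. */
int halves_match(uint32_t word)
{
    return (uint16_t)(word >> 16) == (uint16_t)(word & 0xffffu);
}
```

For the values in the zipdbg dump, `halves_match(0x8e648e64)` is true while `halves_match(0x000485b5)` (the expected value) is false, which is why the read-back looks like the same 16 bits returned twice.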

If you look back any further, you'll see a MOV R6,R3. That moves the "counts" value (R6==1) into the LFSR fill register kept in R3. LDI 0x01800000 is a long (two word) instruction to load 0x01800000 (the address of the beginning of memory) into R0. LSR 1,R3 is the first instruction of the two-step sequence that runs the LFSR, basically shifting all the bits right (R3 >>= 1). Remember, R3 was just set to R6, and so it starts out as one. Shifting it right by one turns it into a zero. Unlike many other architectures I know of, this also sets the carry bit to what used to be R3&1. That means that, on the next instruction, if the carry was set (it was), I exclusive-or R1 (the taps register, 0x0485b5) with R3 (the fill register, which had just turned zero) to get the memory value 0x485b5.
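The two-instruction LFSR step described here translates to C roughly as follows. This is a sketch of the shift-then-conditional-XOR step as explained in the paragraph above, not a copy of the memtest source; `lfsr_step` and `LFSR_TAPS` are hypothetical names, and the taps value is the one quoted from uR1.

```c
#include <assert.h>
#include <stdint.h>

#define LFSR_TAPS 0x000485b5u   /* taps value quoted from uR1 */

/* One LFSR step: shift right by one and, if the bit shifted out was 1,
 * XOR the taps into the result. */
uint32_t lfsr_step(uint32_t fill)
{
    uint32_t carry = fill & 1u;  /* the bit LSR moves into the carry flag */
    fill >>= 1;                  /* LSR 1,R3 */
    if (carry)
        fill ^= LFSR_TAPS;       /* XOR taps into fill, only when carry is set */
    return fill;
}
```

Starting from a fill of 1, the shift leaves 0 with the carry set, so `lfsr_step(1)` returns 0x485b5, the value the test expected to read back from the first address.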

LFSRs are pseudorandom number generators. They aren't true random number generators, but they tend to be low cost and (in this case) repeatable. Here, I start with "counts" and run the LFSR to generate a bunch of random numbers that are then written to the memory, and then repeat the random sequence when reading the results back from memory in order to check that I got the right values back.
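The write-then-replay idea might look like the following in C. It is a simplified sketch over an ordinary array rather than the board's SDRAM, and the names `lfsr_fill` and `lfsr_check` are hypothetical; only the taps value and the repeatable-sequence approach come from the explanation above.

```c
#include <stddef.h>
#include <stdint.h>
#include <assert.h>

#define LFSR_TAPS 0x000485b5u   /* taps value quoted from uR1 */

/* Shift right; XOR in the taps when the bit shifted out was 1. */
static uint32_t lfsr_next(uint32_t fill)
{
    uint32_t carry = fill & 1u;
    fill >>= 1;
    return carry ? (fill ^ LFSR_TAPS) : fill;
}

/* Write the LFSR sequence that starts from `seed` into mem[0..n). */
void lfsr_fill(volatile uint32_t *mem, size_t n, uint32_t seed)
{
    uint32_t fill = seed;
    for (size_t i = 0; i < n; i++) {
        fill = lfsr_next(fill);
        mem[i] = fill;
    }
}

/* Replay the same sequence and compare it against memory.
 * Returns the index of the first mismatch, or n if every word matched. */
size_t lfsr_check(const volatile uint32_t *mem, size_t n, uint32_t seed)
{
    uint32_t fill = seed;
    for (size_t i = 0; i < n; i++) {
        fill = lfsr_next(fill);
        if (mem[i] != fill)
            return i;
    }
    return n;
}
```

Because the sequence is fully determined by the seed, the checker needs no copy of the data it wrote: re-seeding and stepping the LFSR again regenerates every expected value.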

In all, I'm excited for you, and glad you got it running.

Dan

NeuerUser commented 5 years ago

Hi Dan

Thanks for the great explanation. Just for completeness, I found the 0x00C000c8 in the zipdbg output shown above, as the uPC.

I am also very excited. It's so great to have this fascinating full SoC on that small board, to be able to learn how it works, and finally even to be able to modify it. Thanks a lot for publishing this!

(I will probably open some other issues in the next days, e.g. if I have problems, questions or suggestions.)