Understanding READ Commands and Scratchpad Memory Behavior in `ddr5-tester`

biecho commented 1 month ago

Description: I am using the ddr5-tester and have some questions regarding the interaction between READ commands and the scratchpad memory. Specifically:

How many bytes are typically read into the scratchpad memory with each READ command?
The contents of the scratchpad memory after execution differ from what I expected. Below is a snippet of my code and the output I’m seeing:

encoder = Encoder(bankbits=settings.geom.bankbits, nranks=settings.phy.nranks)
PAYLOAD = [
    encoder(OpCode.NOOP, timeslice=50),
    encoder(OpCode.ACT, timeslice=20, address=encoder.address(rank=0, bank=0, row=100)),
    encoder(OpCode.READ, timeslice=20, address=encoder.address(rank=0, bank=0, col=0)),
    encoder(OpCode.NOOP, timeslice=50),
]

def execute(wb):
    program = [w for w in PAYLOAD]
    program += [0] * (wb.mems.payload.size // 4 - len(program))  # fill with NOOPs

    # Write some data to the column we are reading to check that scratchpad gets filled
    converter = DRAMAddressConverter.load()
    data = [0xaaaaaaaa] * 128
    memwrite(wb, data, base=converter.encode_bus(bank=0, row=100, col=0))

    print('\nTransferring the payload ...')
    memwrite(wb, program, base=wb.mems.payload.base)

    def ready():
        status = wb.regs.payload_executor_status.read()
        return (status & 1) != 0

    print('\nExecuting ...')
    assert ready()
    wb.regs.payload_executor_start.write(1)
    while not ready():
        time.sleep(0.001)

    print('Finished')

    print('\nScratchpad contents:')
    scratchpad = memread(wb, n=512 // 4, base=wb.mems.scratchpad.base)
    memdump(scratchpad, base=0)

if __name__ == "__main__":
    wb = RemoteClient()
    wb.open()
    print("Board info:", read_ident(wb))

    execute(wb)

    wb.close()

Output:

Scratchpad contents:
0x00000000:  aa aa aa aa af aa aa aa aa aa aa aa aa aa aa aa  ................
0x00000010:  aa aa aa aa bb aa aa ba aa aa ba fa ff aa ea bb  ................
0x00000020:  ea fb bb ff ff ee ff ff ee ff bb ff ff ee ff ff  ................
0x00000030:  fe ff bb ff ff ee ff 7f ff ff bb ff ff ee ff 77  ...............w
0x00000040:  ff ff bb ff ff cc ff 77 ff ff bb ff ff cc ff 77  .......w.......w
0x00000050:  ff ff bb ff ff cc ff 77 ff ff bb ff ff cc ff 77  .......w.......w
0x00000060:  ff ff bb ff ff cc ff 77 ff ff bb ff ff cc ff 77  .......w.......w
0x00000070:  ff ff bb ff ff cc ff 77 ff ff bb ff ff cc ff 77  .......w.......w
0x00000080:  ff ff bb ff ff cc ff 77 ff ff bb ff ff cc ff 77  .......w.......w
0x00000090:  ff ff bb ff ff cc ff 77 ff ff bb ff ff cc ff 77  .......w.......w
0x000000a0:  ff ff bb ff ff cc ff 77 ff ff bb ff ff cc ff 77  .......w.......w
0x000000b0:  ff ff bb ff ff cc ff 77 ff ff bb ff ff cc ff 77  .......w.......w
0x000000c0:  ff ff bb ff ff cc ff 77 ff ff bb ff ff cc ff 77  .......w.......w
0x000000d0:  ff ff bb ff ff cc ff 77 ff ff bb ff ff cc ff 77  .......w.......w
0x000000e0:  ff ff bb ff ff ee ff 77 ff ff bb ff ff ee ff 77  .......w.......w
0x000000f0:  ff ff bb ff ff ee ff 77 ff ff bb ff ff ee ff 77  .......w.......w
0x00000100:  ff ff bb ff ff ee ff 77 ff ff bb ff ff ee ff 77  .......w.......w
0x00000110:  ff ff bb ff ff ee ff 77 ff ff bb ff ff ee ff 77  .......w.......w
0x00000120:  ff ff bb ff ff ce ff 77 ff ff bb ff ff cc ff 77  .......w.......w
0x00000130:  ff ff bb ff ff ec ff 77 ff ff bb ff ff ee ff 77  .......w.......w
0x00000140:  ff ff bb ff ff ee ff 77 ff ff bb ff ff ee ff 77  .......w.......w
0x00000150:  ff ff bb ff ff ee ff 77 ff ff bb ff ff ee ff 77  .......w.......w
0x00000160:  ff ff bb ff ff ee ff 77 ff ff bb ff ff ee ff 77  .......w.......w
0x00000170:  ff ff bb ff ff ee ff 77 ff ff bb ff ff ee ff 77  .......w.......w
0x00000180:  ff ff bb ff ff ce ff 77 ff ff bb ff ff cc ff 77  .......w.......w
0x00000190:  ff ff bb ff ff cc ff 77 ff ff bb ff ff ec ff 77  .......w.......w
0x000001a0:  ff ff bb ff ff ee ff 77 ff ff bb ff ff ee ff 77  .......w.......w
0x000001b0:  ff ff bb ff ff ee ff 77 ff ff bb ff ff ee ff 77  .......w.......w
0x000001c0:  ff ff bb ff ff ee ff 77 ff ff bb ff ff ee ff 77  .......w.......w
0x000001d0:  ff ff bb ff ff ee ff 77 ff ff bb ff ff ee ff 77  .......w.......w
0x000001e0:  ff ff bb ff ff ee ff 77 ff ff bb ff ff ee ff 77  .......w.......w
0x000001f0:  ff ff bb ff ff ee ff 77 ff ff bb ff ff ee ff 77  .......w.......w

Only the first two rows contain mostly aa bytes, and even there a few bytes differ:

0x00000000:  aa aa aa aa af aa aa aa aa aa aa aa aa aa aa aa  ................
0x00000010:  aa aa aa aa bb aa aa ba aa aa ba fa ff aa ea bb  ................

Could you help me understand what I might be doing wrong? Am I misinterpreting the interaction between the memory and READ commands?

tmichalak commented 1 month ago

Hi @biecho, thanks for reporting the issue.

At first glance your snippet looks correct, but let me run some tests with it and get back to you.

Looking at your output, the reads above offset 0x20 look like reads from an address range that has no memory (scratchpad) behind it. Can you share the commands you used to build the design? Also, if you can share your built project, we could run it locally and examine the behaviour. You can export the build results with the make pack command.

biecho commented 3 weeks ago

Hi @tmichalak, thanks for the response.

I wanted to mention that I'm working on a fork of the project, so using make pack might not be applicable in my case. Could you reproduce the issue with the tests you mentioned?

tmichalak commented 3 weeks ago

@biecho, running make pack should succeed regardless of whether you're using a fork; it's just a target in the Makefile.

biecho commented 3 weeks ago

Yes, you're right. Here we go:

ddr5_tester-main-d23a7631.zip

biecho commented 3 weeks ago

Do we have any updates on this?

tmichalak commented 3 weeks ago

Would it be possible to point me to the fork that you built the bitstream from? Also, what is the exact model of the memory module you are using? I am assuming that the memory training passed, but I would appreciate it if you sent us some logs from the training as well.

biecho commented 3 weeks ago

I'll get back to you with answers to your questions. However, were you able to reproduce it in your test setup when building from the latest commit?

biecho commented 2 weeks ago

Here are the logs generated while building and testing the module: test_log.txt, build_log.txt

biecho commented 2 weeks ago

I was able to reproduce this issue on the latest commit of the main branch.

Here is the execute_payload_read.py code that I executed.

#!/usr/bin/env python3

import time

from rowhammer_tester.gateware.payload_executor import Encoder, OpCode
from rowhammer_tester.scripts.utils import memdump, memread, memwrite, DRAMAddressConverter, RemoteClient, read_ident, \
    get_litedram_settings

# Sample program

litedram_settings = get_litedram_settings()
bankbits = litedram_settings._settings['geom']['bankbits']
encoder = Encoder(bankbits=bankbits)
PAYLOAD = [
    encoder(OpCode.NOOP, timeslice=50),
    encoder(OpCode.ACT, timeslice=20, address=encoder.address(bank=0, row=0)),
    encoder(OpCode.READ, timeslice=20, address=encoder.address(bank=0, col=0)),
    encoder(OpCode.NOOP, timeslice=50),
]

def execute(wb):
    program = [w for w in PAYLOAD]
    program += [0] * (wb.mems.payload.size // 4 - len(program))  # fill with NOOPs

    # Write some data to the column we are reading to check that scratchpad gets filled
    converter = DRAMAddressConverter.load()
    data = [0xaaaaaaaa] * 128
    memwrite(wb, data, base=converter.encode_bus(bank=0, row=0, col=0))

    print('\nTransferring the payload ...')
    memwrite(wb, program, base=wb.mems.payload.base)

    def ready():
        status = wb.regs.payload_executor_status.read()
        return (status & 1) != 0

    print('\nExecuting ...')
    assert ready()
    wb.regs.payload_executor_start.write(1)
    while not ready():
        time.sleep(0.001)

    print('Finished')

    print('\nScratchpad contents:')
    scratchpad = memread(wb, n=512 // 4, base=wb.mems.scratchpad.base)
    memdump(scratchpad, base=0)

if __name__ == "__main__":
    wb = RemoteClient()
    wb.open()
    print("Board info:", read_ident(wb))

    execute(wb)

    wb.close()

I found something interesting and noticed the same pattern as described earlier:

0x00000000:  aa aa aa aa bf aa aa aa aa aa aa aa aa aa aa aa  ................
0x00000010:  aa aa aa aa bb aa aa ba aa aa aa ba ff aa ea bb  ................

For your reference, I've attached both the build_log.txt and the litedram_settings.json.

Have you made any progress on this issue?

build_log.txt litedram_settings.json

tmichalak commented 2 weeks ago

Thanks @biecho for the logs and script. I ran the script on the board with the MTC10F1084S1RC48BA1R module and a bitstream from the main branch, but so far I see:

0x00000000:  ff 77 ff ff ff bb ff ff ff 77 ff ff ff bb ff ff  .w.......w......
0x00000010:  ff 77 ff ff ff bb ff ff ff 77 ff ff ff bb ff ff  .w.......w......

so I need to dig a bit deeper to see what is happening here...

biecho commented 2 weeks ago

Thank you for looking into this. I might have something that is related.

In Encoder.I.address, we have a way to translate rank, bank, row, and col into an address. Here’s the relevant method:

def address(self, *, rank=None, bank=0, row=None, col=None):
    assert not (row is not None and col is not None)
    if row is not None:
        rowcol = row
    elif col is not None:
        rowcol = col
    else:
        rowcol = 0
    address = bank & (2**self.bankbits - 1)
    address |= (rowcol) << self.bankbits
    if self.nranks > 1:
        address <<= log2_int(self.nranks)
        address |= rank
    return address

However, we also have DRAMAddressConverter.encode_bus, which performs a similar function but with additional considerations for the bus width and base address:

def encode_bus(self, *, bank, row, col, base=0x40000000, bus_width=32):
    shift = self._get_bus_shift(bus_width)
    address = self._encode(bank, row, col)
    if shift > 0:
        address <<= shift
    else:
        address >>= -shift
    return base + address

How are these related? And how can I decode an address produced by Encoder.I? There is only a function to encode, not one to decode.
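For illustration, an inverse of the address method quoted above could look roughly like the sketch below. It is not part of rowhammer-tester: encode_address, decode_address and the local log2_int are hypothetical standalone helpers that mirror only the bit layout shown in the quoted method. Because row and col share a single field there, the decoder can only return the combined rowcol value.

```python
def log2_int(n):
    """Return log2(n) for a power of two (local stand-in, assumption)."""
    if n < 1 or n & (n - 1):
        raise ValueError("expected a power of two")
    return n.bit_length() - 1


def encode_address(*, bankbits, nranks=1, rank=0, bank=0, rowcol=0):
    """Mirror of the quoted layout: [rowcol | bank | rank (only if nranks > 1)]."""
    address = bank & (2**bankbits - 1)
    address |= rowcol << bankbits
    if nranks > 1:
        address <<= log2_int(nranks)
        address |= rank
    return address


def decode_address(address, *, bankbits, nranks=1):
    """Hypothetical inverse: returns (rank, bank, rowcol)."""
    rank = 0
    if nranks > 1:
        rankbits = log2_int(nranks)
        rank = address & (2**rankbits - 1)
        address >>= rankbits
    bank = address & (2**bankbits - 1)
    rowcol = address >> bankbits
    return rank, bank, rowcol


addr = encode_address(bankbits=5, bank=0, rowcol=7477)
print(hex(addr))                          # 0x3a6a0
print(decode_address(addr, bankbits=5))   # (0, 0, 7477)
```

Whether rowcol is a row or a column depends on the opcode the address is attached to (ACT versus READ in the payloads above), so that information has to come from the instruction, not from the address bits.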

biecho commented 2 weeks ago

Should these two methods result in the same address?

bank = 0
row = 7477

address_encoder = encoder.address(bank=bank, row=row)
address_converter = converter.encode_bus(bank=bank, row=row, col=0, base=0)

print(f"address_encoder: {hex(address_encoder)}")
print(f"address_converter: {hex(address_converter)}")

bank, row, col = converter.decode_bus(address_converter, base=0)
print(f"Decoded from converter: bank={bank}, row={row}, col={col}")

bank, row, col = converter.decode_bus(address_encoder, base=0)
print(f"Decoded from encoder: bank={bank}, row={row}, col={col}")

Interestingly, the output is as follows:

address_encoder: 0x3a6a0
address_converter: 0x74d40000
Decoded from converter: bank=0, row=7477, col=0
Decoded from encoder: bank=29, row=0, col=424

Looking at the binary:

0x3a6a0              -> 0 0111 0100 1101 0100 000
0x74d40000           -> 0111 0100 1101 0100 0000 0000 0000 0000
After bus shift of 2 -> 0001 1101 0011 0101 0000 0000 0000 0000
row = 0001 1101 0011 0101 = 7477
bank and col are both 0

It appears that the address produced by the encoder is similar to the one generated by the converter, but there is a shift discrepancy. Could there be a missing shift in the encoder logic? Alternatively, can we consistently use the converter when specifying the DRAM address for reading?
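The numbers above can be reproduced with a small standalone model. This is only a sketch: the field widths (5 bank bits, 11 column bits) and the bus shift of 2 are assumptions inferred from the printed values in this comment, not read from litedram_settings.json, and the three functions are hypothetical stand-ins rather than the actual Encoder or DRAMAddressConverter code.

```python
# Hypothetical model of the two address layouts compared above.
# All widths are assumptions inferred from the outputs in this thread.
BANKBITS = 5    # assumed
COLBITS = 11    # assumed
BUS_SHIFT = 2   # assumed


def encoder_layout(bank, rowcol):
    """Payload-style address: [rowcol | bank], no byte shift."""
    return (rowcol << BANKBITS) | (bank & ((1 << BANKBITS) - 1))


def bus_layout(bank, row, col):
    """Bus-style address: [row | bank | col], then shifted left by BUS_SHIFT."""
    word = (row << (BANKBITS + COLBITS)) | (bank << COLBITS) | col
    return word << BUS_SHIFT


def decode_bus_layout(address):
    """Split a bus-style address back into (bank, row, col)."""
    word = address >> BUS_SHIFT
    col = word & ((1 << COLBITS) - 1)
    bank = (word >> COLBITS) & ((1 << BANKBITS) - 1)
    row = word >> (BANKBITS + COLBITS)
    return bank, row, col


enc = encoder_layout(bank=0, rowcol=7477)
bus = bus_layout(bank=0, row=7477, col=0)
print(hex(enc))                 # 0x3a6a0
print(hex(bus))                 # 0x74d40000
print(decode_bus_layout(bus))   # (0, 7477, 0)
print(decode_bus_layout(enc))   # (29, 0, 424)
```

Under these assumptions the two values are different encodings rather than shifted copies of one another: the encoder-style address carries no column field next to the row and is not byte-shifted, so decoding it with the bus layout misreads row bits as bank and column bits.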

biecho commented 2 weeks ago

I attempted to work consistently with the following address encoding:

address = converter.encode_bus(bank=0, row=0, col=0, base=0)

Using this approach, I executed the following program:

# Sample program
encoder = Encoder(bankbits=5)
converter = DRAMAddressConverter.load()
PAYLOAD = [
    encoder(OpCode.NOOP, timeslice=60),
    encoder(OpCode.PRE, timeslice=60, address=converter.encode_bus(bank=0, row=0, col=0, base=0)),
    encoder(OpCode.ACT, timeslice=60, address=converter.encode_bus(bank=0, row=0, col=0, base=0)),
    encoder(OpCode.READ, timeslice=60, address=converter.encode_bus(bank=0, row=0, col=0, base=0)),
    encoder(OpCode.NOOP, timeslice=60),
]

def execute(wb):
    program = [w for w in PAYLOAD]
    program += [0] * (wb.mems.payload.size // 4 - len(program))  # Fill with NOOPs

    # Write some data to the column we are reading to check that the scratchpad gets filled
    data = [0xaaaaaaaa] * 128
    memwrite(wb, data, base=converter.encode_bus(bank=0, row=0, col=0))

    print('\nTransferring the payload ...')
    memwrite(wb, program, base=wb.mems.payload.base)

    def ready():
        status = wb.regs.payload_executor_status.read()
        return (status & 1) != 0

    print('\nExecuting ...')
    assert ready()
    wb.regs.payload_executor_start.write(1)
    while not ready():
        time.sleep(0.001)

    print('Finished')

    print('\nScratchpad contents:')
    scratchpad = memread(wb, n=512 // 4, base=wb.mems.scratchpad.base)
    memdump(scratchpad, base=0)

if __name__ == "__main__":
    wb = RemoteClient()
    wb.open()
    print("Board info:", read_ident(wb))

    execute(wb)

    wb.close()

After running the code, the scratchpad contents were as follows:

Scratchpad contents:
0x00000000:  aa aa aa aa bf aa aa aa aa aa aa aa aa aa aa aa  ................
0x00000010:  aa aa aa aa bb aa aa ba aa aa ba fa ff aa ea bb  ................

However, when using the pattern 0xffffffff, the scratchpad contents were:

Scratchpad contents:
0x00000000:  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................
0x00000010:  ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff  ................

I see the same error again with 0xaaaaaaaa, but with 0xffffffff it appears to work.
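To make the difference between the two patterns easier to see, the dump can be compared byte by byte against the expected value. The sketch below is a standalone helper, not part of rowhammer-tester; parse_dump is a hypothetical function that assumes the memdump line format shown above (address, 16 hex bytes, ASCII column), and the data is copied from the 0xaaaaaaaa run.

```python
EXPECTED = 0xAA  # byte pattern written to DRAM in the script above

# First two rows of the scratchpad dump from the 0xaaaaaaaa run.
DUMP = """\
0x00000000:  aa aa aa aa bf aa aa aa aa aa aa aa aa aa aa aa  ................
0x00000010:  aa aa aa aa bb aa aa ba aa aa ba fa ff aa ea bb  ................
"""


def parse_dump(text):
    """Return {offset: byte} parsed from memdump-style lines (assumed format)."""
    result = {}
    for line in text.splitlines():
        addr, rest = line.split(":", 1)
        base = int(addr, 16)
        # The first 16 whitespace-separated tokens are the hex bytes.
        for i, token in enumerate(rest.split()[:16]):
            result[base + i] = int(token, 16)
    return result


data = parse_dump(DUMP)

# XOR against the expected byte shows which bits differ at each offset.
mismatches = {off: b ^ EXPECTED for off, b in data.items() if b != EXPECTED}

# True if every byte still has all expected 1 bits set (only 0 -> 1 changes).
only_extra_ones = all((b & EXPECTED) == EXPECTED for b in data.values())

for off, diff in sorted(mismatches.items()):
    print(f"offset 0x{off:02x}: read 0x{data[off]:02x}, differing bits 0x{diff:02x}")
print("mismatching bytes:", len(mismatches))        # 8
print("only 0->1 differences:", only_extra_ones)    # True
```

In this excerpt every mismatching byte still contains all of the 0xaa bits, so the differences are extra 1 bits only. Differences of that kind would not be visible with an all-ones pattern such as 0xffffffff. This narrows down what to look at but does not by itself identify the cause.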

antmicro / rowhammer-tester, issue #183