enjoy-digital / litedram

Small footprint and configurable DRAM core

Why is my simulation different to my target in regard to "address endianness"? #251

Open nickoe opened 3 years ago

nickoe commented 3 years ago

This is probably more of a "support" question. :S

Background

I am trying to learn how to stream data from SDRAM to something else, probably running at a lower clock rate.

Right now I am loading some test data into memory with boot.json, together with my slightly modified bare-metal demo.bin. The modification adds a small function that dumps the contents of memory so that I can verify it looks as I expect. The code snippet looks as follows:

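#include <stdint.h>
#include <stdio.h>

static void dump_data(void)
{
    /* Address at which the test data is loaded. */
    const uintptr_t base_addr = 0x41000000;
    const volatile uint32_t *data = (const volatile uint32_t *)base_addr;
    uint32_t i;

    printf("\nAddress    Raw Word   I    Q\n");
    for (i = 0; i < 124; i++) {
        uint32_t word = data[i];

        /* Byte address of this 32-bit word, then the raw word itself. */
        printf("0x%08x ", (unsigned int)(base_addr + i * 4));
        printf("0x%08x ", (unsigned int)word);
        /* I is the low 16 bits, Q is the high 16 bits. */
        printf("%4u %4u \n", (unsigned int)(word & 0xffff), (unsigned int)(word >> 16));
    }
    printf("\nDone dumping.\n");
}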

And I get data out as I expect it.

Address    Raw Word   I    Q
0x41000000 0x020003ff 1023  512 
0x41000004 0x02da03ce  974  730 
0x41000008 0x038a0346  838  906 
0x4100000c 0x03ef027f  639 1007 
0x41000010 0x03f601a0  416 1014 
0x41000014 0x039d00d4  212  925 
0x41000018 0x02f60040   64  758 
0x4100001c 0x02200002    2  544 
0x41000020 0x01440025   37  324 
0x41000024 0x008b00a2  162  139 
0x41000028 0x001a0162  354   26 
0x4100002c 0x00050240  576    5 
...

I can clearly see the data in the correct order in my simulation, but not when I run on hardware. The hardware is a custom board built around a "premade" module that carries the Artix-7 and the RAM, specifically the Enclustra Mars AX3.

Right now I have found that one can use the stream infrastructure to easily insert a CDC (clock domain crossing) buffer. This appears to work perfectly in simulation. Note that I get the same output from my modified demo.bin in sim and on target, suggesting that the data is put in memory the same way, which leaves how litedram accesses the memory as the only difference.

Problem

Originally I found this discrepancy between the sim and target. (image)

Then it dawned upon me that the sine wave just seems to have chunks of samples reversed.

I get 4x32 bits from the DMA for each 128-bit read, but I get the lowest address last. Compare the dump above with this capture: (image)

That is directly out of the DMA as I have wired it with self.output_sig2.eq(dma.source.data).
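One way to confirm the "chunks reversed" reading is to compare the captured DMA output against the memory dump in software. The following is a minimal sketch, assuming each DMA read delivers four 32-bit words and that only the order within each group of four differs; the function name `is_burst_reversed` and the constant `WORDS_PER_BURST` are hypothetical and not part of litedram.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORDS_PER_BURST 4 /* assumed: 128-bit read split into 32-bit words */

/* Return 1 if `captured` equals `expected` with the word order mirrored
 * inside every group of WORDS_PER_BURST words, 0 otherwise. */
static int is_burst_reversed(const uint32_t *expected,
                             const uint32_t *captured, size_t n)
{
    size_t i;

    assert(n % WORDS_PER_BURST == 0);
    for (i = 0; i < n; i++) {
        size_t base = i - (i % WORDS_PER_BURST);
        size_t mirrored = base + (WORDS_PER_BURST - 1 - (i % WORDS_PER_BURST));

        if (captured[i] != expected[mirrored])
            return 0;
    }
    return 1;
}
```

Feeding it the first words of the dump as `expected` and the words read off the logic capture as `captured` tells whether the mismatch is purely an ordering issue within each 128-bit read or something else.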

This is my code https://github.com/nickoe/litex-boards/blob/f3090247db0a7d26291c39860eede3a3aa46ca64/litex_boards/targets/mars_ax3_custom.py#L133-L152

I first tried to use my own FSM for the LiteDRAMDMAReader, which I now think I understand pretty well, but I also tried the DMAReader from litevideo, which appears to give me the same results. That leaves me with a difference somewhere in the hardware.

This is my platform io definition: https://github.com/nickoe/litex-boards/blob/f3090247db0a7d26291c39860eede3a3aa46ca64/litex_boards/platforms/mars_ax3.py#L45-L83

This is where I added my own new SDRAM timing definitions; the design passes the memory check in the BIOS. https://github.com/nickoe/litex-boards/blob/f3090247db0a7d26291c39860eede3a3aa46ca64/litex_boards/targets/mars_ax3.py#L88-L108

I hope someone can enlighten me as to why this happens. My expectation was to get the data out in the order I am addressing it, not in reverse order.

nickoe commented 3 years ago

Ok, SOLUTION found. I asked in #symbiflow and got this response from andrewb1999 which was spot on.

This is happening because of the conversion from 128-bit native DRAM width to the 32-bit width requested here: https://github.com/nickoe/litex-boards/blob/f3090247db0a7d26291c39860eede3a3aa46ca64/litex_boards/targets/mars_ax3.py#L229 The solution is to add the parameter 'reverse=True' to the get port function call which will switch the order of 32-bit words when splitting a 128-bit word.
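To make the effect of the word order concrete, here is a minimal software model of splitting one 128-bit word into four 32-bit words in either order. It is only a sketch: `split_word128` and its `highest_first` flag are hypothetical names, the 128-bit word is represented as an array whose index 0 holds bits 31:0, and the sketch does not claim which of the two orders litedram produces for a given 'reverse' setting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not litedram code.
 * word128[0] holds bits 31:0 of the wide word, word128[3] holds bits 127:96.
 * out[] receives the four 32-bit words in the order they would be streamed. */
static void split_word128(const uint32_t word128[4], int highest_first,
                          uint32_t out[4])
{
    size_t i;

    assert(word128 != NULL && out != NULL);
    for (i = 0; i < 4; i++)
        out[i] = highest_first ? word128[3 - i] : word128[i];
}
```

With the first four words of the dump as input, one setting of the flag yields the order seen in the memory dump and the other yields the order observed on target, with the lowest address last.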

I still wonder why this is not shown in the simulation. I would like to know how I can make the simulation have the same behaviour.