arduino / ArduinoCore-mbed

SDRAM memory write corruption #797

Open MoveElectricMobility opened 9 months ago

MoveElectricMobility commented 9 months ago

Under the latest core, writing a simple struct of 1112 bytes (aligned to 8 bytes using "__attribute__((aligned(8)));") to SDRAM and then reading it back returns flipped bits. SDRAM.begin(); is called, and the buffer is allocated with a proper SDRAM.malloc(); -> data_storage_sdram = (DATA_FRAME_RETURN *) SDRAM.malloc((sizeof(struct DATA_FRAME_RETURN) * MAX_NUMBER_OF_ELEMENTS_SDRAM) + 256);. I have printed the before and after at the byte level in hex for you to look at below (I have also tried this on all 6 Gigas I own, with exactly the same results):

0 0 60000325
0 0 60000326
0 0 60000327
0 0 60000328
0 0 60000329
8 8 6000032a
40 40 6000032b

fc fc 6000032c
a9 a9 6000032d
f1 f1 6000032e
d2 d3 6000032f  -----*
4d 4d 60000330
62 62 60000331
50 50 60000332
3f 3f 60000333

0 0 60000334
0 0 60000335
0 0 60000336
0 0 60000337
0 0 60000338
0 0 60000339
2e 2e 6000033a
40 40 6000033b

c8 c8 6000033c
0 0 6000033d
0 c8 6000033e  -----*
0 0 6000033f
0 0 60000340
0 0 60000341
0 0 60000342
0 0 60000343

0 0 60000344
0 0 60000345
0 0 60000346
0 0 60000347
0 0 60000348
0 0 60000349
8 8 6000034a
40 40 6000034b

fc fc 6000034c
a9 a9 6000034d
f1 f1 6000034e
d2 d3 6000034f  -----*
4d 4d 60000350
62 62 60000351
50 50 60000352
3f 3f 60000353

0 0 0x60000384
0 0 0x60000385
0 0 0x60000386
0 0 0x60000387
0 0 0x60000388
0 0 0x60000389
54 54 0x6000038a
40 40 0x6000038b

fa fa 0x6000038c
7e 7e 0x6000038d
6a fa 0x6000038e  -----*
bc bc 0x6000038f
74 74 0x60000390
93 93 0x60000391
68 78 0x60000392  -----*
3f 3f 0x60000393

0 0 0x60000394
0 0 0x60000395
0 0 0x60000396
0 0 0x60000397
0 0 0x60000398
94 94 0x60000399
c1 c1 0x6000039a
40 40 0x6000039b

96 96 0x6000039c
0 0 0x6000039d
0 92 0x6000039e  -----*
0 0 0x6000039f
0 0 0x600003a0
0 0 0x600003a1
0 0 0x600003a2
0 0 0x600003a3

0 0 0x600003a4
0 0 0x600003a5
0 0 0x600003a6
0 0 0x600003a7
0 0 0x600003a8
0 0 0x600003a9
4e 4e 0x600003aa
40 40 0x600003ab

7b 7b 0x600003ac
14 14 0x600003ad
ae fe 0x600003ae  -----*
47 06 0x600003af  -----*
e1 e1 0x600003b0
7a 7a 0x600003b1
b4 f4 0x600003b2  -----*
3f 3f 0x600003b3

0 0 0x600003b4
0 0 0x600003b5
0 0 0x600003b6
0 0 0x600003b7
0 0 0x600003b8
40 40 0x600003b9
8f 8f 0x600003ba

AND

0 0
0 0
0 0
0 0
0 0
8 8
40 40

fc fc
a9 a9
f1 f9 1111[0]001 -> 1111[1]001
d2 e3 11[01]001[0] -> 11[10]001[1]
4d 4d 
62 62
50 58 0101[0]000 -> 0101[1]000
3f 3f

0 0
0 0
0 0
0 0
0 0
0 0
2e 2e
40 40

c8 c8
0 0
0 c8 00000000 -> 11001000
0 0
0 0
0 0
0 0
0 0

0 0
0 0
0 0
0 0
0 0
0 0
8 8
40 40

fc fc
a9 a9
f1 f9 1111[0]001 -> 1111[1]001
d2 e3 11[01]001[0] -> 11[10]001[1]
4d 4d
62 62
50 50
3f 3f

Any ideas why this is happening, or what can be done to fix it?
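For anyone who wants to produce the same kind of byte-level comparison, here is a minimal sketch in plain C++. It is not the reporter's code: the names Mismatch, to_binary, find_mismatches and print_mismatches are hypothetical, and the helper only compares two buffers that are already in memory. On the board, the read-back buffer would presumably be the region returned by SDRAM.malloc() as described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// One byte that differs between the data written and the data read back.
struct Mismatch {
    std::size_t offset;    // byte offset from the start of the buffer
    std::uint8_t written;  // value that was stored
    std::uint8_t read;     // value that came back
    std::uint8_t flipped;  // XOR mask: a 1 marks each bit that changed
};

// Render a byte as eight binary digits, most significant bit first.
std::string to_binary(std::uint8_t value) {
    std::string bits(8, '0');
    for (int i = 0; i < 8; ++i) {
        if (value & (0x80u >> i)) bits[i] = '1';
    }
    return bits;
}

// Compare two buffers byte by byte and collect every difference.
std::vector<Mismatch> find_mismatches(const std::uint8_t* written,
                                      const std::uint8_t* read,
                                      std::size_t length) {
    assert(written != nullptr && read != nullptr);
    std::vector<Mismatch> result;
    for (std::size_t i = 0; i < length; ++i) {
        if (written[i] != read[i]) {
            result.push_back({i, written[i], read[i],
                              static_cast<std::uint8_t>(written[i] ^ read[i])});
        }
    }
    return result;
}

// Print one line per mismatch: written byte, read byte, address, bit patterns.
// `base` is the address of the first byte (hypothetical, e.g. 0x60000000).
void print_mismatches(const std::vector<Mismatch>& list, std::uintptr_t base) {
    for (const Mismatch& m : list) {
        std::printf("%02x %02x 0x%08lx  %s -> %s\n",
                    static_cast<unsigned>(m.written),
                    static_cast<unsigned>(m.read),
                    static_cast<unsigned long>(base + m.offset),
                    to_binary(m.written).c_str(),
                    to_binary(m.read).c_str());
    }
}
```

The XOR mask makes it easy to see whether the same data lines are affected each time. For the pair f1 / f9 above the mask is 08, a single flipped bit.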

facchinm commented 9 months ago

@MoveElectricMobility can you share a real sketch that can be used to reproduce the issue and the core version that's affected?

MoveElectricMobility commented 9 months ago

@facchinm I worked through a lot of low-level HAL code, plus some compiler tricks to get around your build chain, to manually lower the SDRAM speed from 100 MHz to 80 MHz, and the issues disappeared. With the current core, running the SDRAM at 100 MHz with many large and very frequent single writes of thousands of bytes (250 writes per second of either 1112 or 1756 bytes, in different tests) results in random corruption. I think this might be a hardware issue. I've got the electrical engineers on my team working on the problem. As a next step we might try swapping the ESMT M12L64164A SDRAM chip for something else, if possible. I'll keep you posted. For the next core or SDRAM library revision, I would suggest lowering the SDRAM frequency or maybe increasing the voltage.
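To compare error rates between clock settings such as 100 MHz and 80 MHz, a known-pattern test can help. The following is a rough sketch in plain C++, under stated assumptions: write_pattern, count_bit_errors and the xorshift32 generator are hypothetical helpers chosen for illustration, not part of the core or the SDRAM library. The memory pointer is volatile so the compiler does not optimise the accesses away. On the board it would point into the SDRAM allocation, and the length could be a multiple of the 1112-byte or 1756-byte frame size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Small deterministic generator (xorshift32), so the same byte stream can be
// regenerated later for checking. The seed must be non-zero.
std::uint32_t xorshift32(std::uint32_t& state) {
    state ^= state << 13;
    state ^= state >> 17;
    state ^= state << 5;
    return state;
}

// Fill `length` bytes of `memory` with the pseudo-random stream for `seed`.
void write_pattern(volatile std::uint8_t* memory, std::size_t length,
                   std::uint32_t seed) {
    assert(memory != nullptr && seed != 0);
    std::uint32_t state = seed;
    for (std::size_t i = 0; i < length; ++i) {
        memory[i] = static_cast<std::uint8_t>(xorshift32(state));
    }
}

// Regenerate the stream for `seed`, read `memory` back, and return the total
// number of bits that differ from what was written.
std::size_t count_bit_errors(const volatile std::uint8_t* memory,
                             std::size_t length, std::uint32_t seed) {
    assert(memory != nullptr && seed != 0);
    std::uint32_t state = seed;
    std::size_t errors = 0;
    for (std::size_t i = 0; i < length; ++i) {
        const std::uint8_t expected = static_cast<std::uint8_t>(xorshift32(state));
        std::uint8_t diff = static_cast<std::uint8_t>(expected ^ memory[i]);
        while (diff != 0) {  // count the set bits in the XOR mask
            errors += diff & 1u;
            diff = static_cast<std::uint8_t>(diff >> 1);
        }
    }
    return errors;
}
```

Calling write_pattern and then count_bit_errors with the same seed, repeated at the write rate used in the failing tests, gives a bit-error count per pass that can be logged for each clock setting.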