linux4sam / at91bootstrap

Second level bootloader for Microchip SoC (aka AT91)
https://www.linux4sam.org/linux4sam/bin/view/Linux4SAM/AT91Bootstrap4

CONFIG_NAND_TIMING_MODE not working on sam9x60 custom board #174

Open LeSpocky opened 8 months ago

LeSpocky commented 8 months ago

After successfully evaluating the sam9x60-curiosity board with at91bootstrap v4.0.6 and booting from NAND flash, we based our own design on the D5M variant of the sam9x60 SiP. The raw NAND flash chip we are using is a Spansion® SLC NAND Flash Memory S34ML02G1, which has a different page size and spare area size than the MX30LF4G28AD used on the curiosity board. The at91bootstrap configuration (see .config) is based on sam9x60_curiositynf_uboot_defconfig, and I get this on boot if CONFIG_NAND_TIMING_MODE is enabled:

AT91Bootstrap 4.0.8 (2020-08-01 00:00:00)

NAND: ONFI flash detected
NAND: Manufacturer ID: 0x1 Chip ID: 0xda
NAND: Page Bytes: 2048, Spare Bytes: 64
NAND: ECC Correctability Bits: 1, ECC Sector Bytes: 512
NAND: Switch to timing mode 3
NAND: Disable On-Die ECC
PMECC: version is: 0x102
PMECC: page_size: 2048, oob_size: 64, pmecc_cap: 8, sector_size: 512
NAND: Initialize PMECC params, cap: 8, sector: 512
NAND: Image: Copy 0xc0000 bytes from 0x40000 to 0x21f00000
PMECC: sector bits = 15, bit 1 means corrupted sector, Now correcting...
Correct error bit in OOB @[#Byte 6,Bit# 5] 164 -> 132
Correct error bit @[#Byte 498,Bit# 5] 160 -> 128
Correct error bit @[#Byte 402,Bit# 5] 160 -> 128
Correct error bit @[#Byte 306,Bit# 5] 160 -> 128
Correct error bit @[#Byte 210,Bit# 5] 160 -> 128
Correct error bit @[#Byte 137,Bit# 5] 32 -> 0
Correct error bit @[#Byte 114,Bit# 5] 160 -> 128
PMECC: failed to correct corrupted bits!

If I disable CONFIG_NAND_TIMING_MODE loading the U-Boot image is successful like this:

AT91Bootstrap 4.0.8 (2020-08-01 00:00:00)

NAND: ONFI flash detected
NAND: Manufacturer ID: 0x1 Chip ID: 0xda
NAND: Page Bytes: 2048, Spare Bytes: 64
NAND: ECC Correctability Bits: 1, ECC Sector Bytes: 512
NAND: Disable On-Die ECC
PMECC: version is: 0x102
PMECC: page_size: 2048, oob_size: 64, pmecc_cap: 8, sector_size: 512
NAND: Initialize PMECC params, cap: 8, sector: 512
NAND: Image: Copy 0xc0000 bytes from 0x40000 to 0x21f00000
NAND: Done to load image

Notice the additional line "NAND: Switch to timing mode 3" in the failing log. The option was enabled for sam9x60 boards with commit 8427813d282fb8bb0392820e01ab0a22ff6d48c9 before release v4.0.6, and it does work here on the sam9x60 curiosity (also timing mode 3). I have not checked in detail the timings that are chosen for timing mode 3.

LeSpocky commented 8 months ago

According to the datasheet, the S34ML02G1 does not support the CMD_GET_FEATURE (EEh) command sent in function nand_get_feature_timing_mode(). At least that command is not listed in the "command set" section of the datasheet. Quote:

Open NAND Flash Interface (ONFI) 1.0 compliant

All ONFI spec versions from 1.0 to 5.1 I looked at list the command 0xEE as optional.

LiBinSHA commented 8 months ago

Yes, I confirm this issue. I will add timing mode support for this kind of NAND flash in the next release.

LeSpocky commented 8 months ago

After investigating the timing issues more deeply in U-Boot and having another look at at91bootstrap, I came to the conclusion that at91bootstrap does nothing wrong here.

It seems to be a problem with the specific NAND flash and/or our board layout in combination with SAM9X60. The slower timing modes 0 to 2 work fine in U-Boot, but mode 3 also fails. Symptom is always ECC errors.

(The same flash chip works fine when used with SAMA5D2 or SAM9G20 on other boards, which have a lower rate for MCK, and thus use slightly different timings in the end.)

U-Boot binary is currently 676772 bytes here. In mode 0 we can read with 5.0 MiB/s which would take us ~130 ms to read U-Boot binary. In mode 3 with roughly 9.7 MiB/s it would take ~67 ms to load U-Boot. I'm not optimizing for 70 ms boot time here, so I will leave CONFIG_NAND_TIMING_MODE disabled for now.

If you still want to look into this, maybe some idea how to support other modes would be nice, for example if I could override the mode from Kconfig?

LiBinSHA commented 8 months ago

Some NAND flash chips (S34ML01G2 and W29N02KVxxAF) do not work properly in Timing Mode 3, since their maximum tREA time is 4 ns longer than that of normal NAND flash. The workaround is to extend the SMC NRD pulse to meet the tREA timing.

LiBinSHA commented 8 months ago

Please try this patch. 0001-driver-nandflash-update-nand-smc-timing.patch

LeSpocky commented 8 months ago

> Some NAND flash chips (S34ML01G2 and W29N02KVxxAF) do not work properly in Timing Mode 3, since their maximum tREA time is 4 ns longer than that of normal NAND flash. The workaround is to extend the SMC NRD pulse to meet the tREA timing.

The S34ML01G2 you mentioned is not the same chip as the S34ML02G1 we use. I studied the datasheets of both chips as well as the ONFI Spec Revision 4.2 again. The timings of the two chips are the same and also mostly match ONFI timing mode 3; in particular, tREA is listed as 20 ns in all three documents. So I'm not sure what you mean by "4ns longer" and "normal NAND flash"?

> Please try this patch. 0001-driver-nandflash-update-nand-smc-timing.patch

With that patch it works, see my debug output:

--- at91bootstrap-nand-before.log       2024-02-29 14:39:48.270341137 +0100
+++ at91bootstrap-nand-after.log        2024-02-29 14:42:50.914346614 +0100
@@ -7,12 +7,11 @@
 NAND: ECC Correctability Bits: 1, ECC Sector Bytes: 512
 NAND: mode: 3, cs: 3, mck_ps: 5000 (5 ns), tdf: 15 (75 ns)
 NAND: NWE: setup: 2 (10 ns), pulse: 3 (15 ns), hold: 2 (10 ns), cycle: 7 (35 ns)
-NAND: NRD: setup: 0 (0 ns), pulse: 3 (15 ns), hold: 3 (15 ns), cycle: 6 (30 ns)
+NAND: NRD: setup: 0 (0 ns), pulse: 4 (20 ns), hold: 3 (15 ns), cycle: 7 (35 ns)
 NAND: Switch to timing mode 3
 NAND: Disable On-Die ECC
 PMECC: version is: 0x102
 PMECC: page_size: 2048, oob_size: 64, pmecc_cap: 8, sector_size: 512
 NAND: Initialize PMECC params, cap: 8, sector: 512
 NAND: Image: Copy 0xc0000 bytes from 0x40000 to 0x21f00000
-PMECC: sector bits = 15, bit 1 means corrupted sector, Now correcting...
-PMECC: failed to correct corrupted bits!
+NAND: Done to load image

So you might add this to your patch:

Tested-by: Alexander Dahl <ada@thorsis.com>

I used another patch to create that debug output, but GitHub does not allow me to attach it here. Should I make a PR for that?

Bonus question: I made the same change proposed by your patch in U-Boot, and now I can successfully read from the flash in U-Boot, too. With the change applied (NRD pulse of 4 cycles), the transfer rate is somewhat lower than on the sam9x60 curiosity board, which runs with 3 pulse cycles.

If I were to propose the change to the U-Boot developers, should that workaround go into the generic atmel raw nand driver, or should it be made a quirk based on the nand chip instead of the nand controller? (The same question would apply to Linux, but I did not test the fix in Linux yet.)

LeSpocky commented 8 months ago

One further thing: the SAM9X60-Curiosity User's Guide mentions this for the NAND Flash:

Matched Net Lengths [Tolerance = 0.5mm]

On our prototype board these lengths are not matched. Can that be another reason for the flash not working with the previous timings?

LiBinSHA commented 7 months ago

The patch I provided is based on Bitbucket, not GitHub, sorry about that. The timing mode 3 issue can also be reproduced on Linux. I will apply this patch to at91bootstrap.

LeSpocky commented 6 months ago

> Please try this patch. 0001-driver-nandflash-update-nand-smc-timing.patch

Changeset e2dfd8141d00613a37acee66ef5724f70f34a538 hit master. It extends the patch proposed here with another check so that it is only applied for TIMING_MODE_3.

In the meantime I was able to verify that the same approach works in U-Boot and Linux.

Did not test v4.0.9-rc1 on real hardware yet though, will do that later.

LeSpocky commented 6 months ago

> Did not test v4.0.9-rc1 on real hardware yet though, will do that later.

Works for me:

AT91Bootstrap 4.0.9-rc1 (2020-08-01 00:00:00)

NAND: ONFI flash detected
NAND: Manufacturer ID: 0x1 Chip ID: 0xda
NAND: Page Bytes: 2048, Spare Bytes: 64
NAND: ECC Correctability Bits: 1, ECC Sector Bytes: 512
NAND: Switch to timing mode 3
NAND: Disable On-Die ECC
NAND: Initialize PMECC params, cap: 8, sector: 512
NAND: Image: Copy 0xc0000 bytes from 0x40000 to 0x21f00000
NAND: Done to load image