timvideos / litex-buildenv

An environment for building LiteX based FPGA designs. Makes it easy to get everything you need!
BSD 2-Clause "Simplified" License
214 stars 79 forks source link

make gateware-flash "FAILED!" on TinyFPGA BX #137

Open zrecore opened 5 years ago

zrecore commented 5 years ago

make_gateware_output.txt make_gateware-flash_output.txt

It seems the final write doesn't actually write all bytes, as read comes back missing some final bytes.

ewenmcneill commented 5 years ago

FWIW, my theory is that it actually does write all the bytes, but it doesn't read all the bytes, due to the way that read() works on linux/unix, and this (combined with aggressively short timeouts), is the cause of a bunch of related issues. I added a bit more analysis at https://github.com/tinyfpga/TinyFPGA-Bootloader/pull/49#issuecomment-493381369 and https://github.com/tinyfpga/TinyFPGA-Bootloader/pull/49#issuecomment-493392040.

In your results, it looks like what is read back is offset by 32 bytes, compared with what was written...

... but maybe there is also a problem with short writes too.

Ewen

zrecore commented 5 years ago
Flash image: build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin
ff 00 00 ff 7e aa 99 7e 51 00 01 05 92 00 21 62 03 67 72 01 10 82 00 00 11 00 01 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
make[1]: Leaving directory '/home/aalbino/litex-buildenv'
tinyprog --program-image build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin

    TinyProg CLI
    ------------
    Using device id 1d50:6130
    Only one board with active bootloader, using it.

    Programming /dev/ttyACM0 with build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin
    Programming at addr 028000
    Waking up SPI flash
    180192 bytes to program
    Programming and Verifying: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████| 180k/180k [00:06<00:00, 9.81kB/s]
Offset: 343808
Readback Data:
10 00 00 14 10 00 00 18 10 00 00 14 3c 4e 55 4c
4c 3e 00 00 20 05 36 58 20 05 3f cc 6b 0e 14 bd
ae d1 6a 4a d9 d6 5a dc 40 df 0b 66 37 d8 3b f0
a9 bc ae 53 de bb 9e c5 47 b2 cf 7f 30 b5 ff e9
bd bd f2 1c ca ba c2 8a 53 b3 93 30 24 b4 a3 a6
ba d0 36 05 cd d7 06 93 54 de 57 29 23 d9 67 bf
b3 66 7a 2e c4 61 4a b8 5d 68 1b 02 2a 6f 2b 94
b4 0b be 37 c3 0c 8e a1 5a 05 df 1b 2d 02 ef 8d
10 00 00 10 10 00 00 0c 10 00 00 08 10 00 00 10
10 00 00 08 10 00 00 0c 10 00 00 08 10 00 00 a0
10 00 00 9c 10 00 00 a4 10 00 00 14 10 00 00 18
10 00 00 1c 10 00 00 9c 10 00 00 a0 10 00 00 a4
10 00 00 a0 10 00 00 9c 10 00 00 18 10 00 00 14
10 00 00 1c 10 00 00 a0 10 00 00 9c 10 00 00 18
Write Data:
ae d1 6a 4a d9 d6 5a dc 40 df 0b 66 37 d8 3b f0
a9 bc ae 53 de bb 9e c5 47 b2 cf 7f 30 b5 ff e9
bd bd f2 1c ca ba c2 8a 53 b3 93 30 24 b4 a3 a6
ba d0 36 05 cd d7 06 93 54 de 57 29 23 d9 67 bf
b3 66 7a 2e c4 61 4a b8 5d 68 1b 02 2a 6f 2b 94
b4 0b be 37 c3 0c 8e a1 5a 05 df 1b 2d 02 ef 8d
10 00 00 10 10 00 00 0c 10 00 00 08 10 00 00 10
10 00 00 08 10 00 00 0c 10 00 00 08 10 00 00 a0
10 00 00 9c 10 00 00 a4 10 00 00 14 10 00 00 18
10 00 00 1c 10 00 00 9c 10 00 00 a0 10 00 00 a4
10 00 00 a0 10 00 00 9c 10 00 00 18 10 00 00 14
10 00 00 1c 10 00 00 a0 10 00 00 9c 10 00 00 18
10 00 00 14 10 00 00 18 10 00 00 14 3c 4e 55 4c
4c 3e 00 00 20 05 36 58 20 05 3f cc 6b 0e 14 bd
    FAILED!

make: *** [targets/tinyfpga_bx/Makefile.mk:27: gateware-flash-tinyfpga_bx] Error 1

export FIRMWARE=stub; make image-flash works successfully. export FIRMWARE=micropython; make image-flash works successfully too.

It seems to be some issue with the gateware without the firmware that is causing issue...

zrecore commented 5 years ago

There is some type of weird wrapping going on here?

Readback Data:
1 - 10 00 00 14 10 00 00 18 10 00 00 14 3c 4e 55 4c
2 - 4c 3e 00 00 20 05 36 58 20 05 3f cc 6b 0e 14 bd
3 - ae d1 6a 4a d9 d6 5a dc 40 df 0b 66 37 d8 3b f0
4 - a9 bc ae 53 de bb 9e c5 47 b2 cf 7f 30 b5 ff e9
bd bd f2 1c ca ba c2 8a 53 b3 93 30 24 b4 a3 a6
ba d0 36 05 cd d7 06 93 54 de 57 29 23 d9 67 bf
b3 66 7a 2e c4 61 4a b8 5d 68 1b 02 2a 6f 2b 94
b4 0b be 37 c3 0c 8e a1 5a 05 df 1b 2d 02 ef 8d
10 00 00 10 10 00 00 0c 10 00 00 08 10 00 00 10
10 00 00 08 10 00 00 0c 10 00 00 08 10 00 00 a0
10 00 00 9c 10 00 00 a4 10 00 00 14 10 00 00 18
10 00 00 1c 10 00 00 9c 10 00 00 a0 10 00 00 a4
10 00 00 a0 10 00 00 9c 10 00 00 18 10 00 00 14
10 00 00 1c 10 00 00 a0 10 00 00 9c 10 00 00 18
Write Data:
3 - ae d1 6a 4a d9 d6 5a dc 40 df 0b 66 37 d8 3b f0
4 - a9 bc ae 53 de bb 9e c5 47 b2 cf 7f 30 b5 ff e9
bd bd f2 1c ca ba c2 8a 53 b3 93 30 24 b4 a3 a6
ba d0 36 05 cd d7 06 93 54 de 57 29 23 d9 67 bf
b3 66 7a 2e c4 61 4a b8 5d 68 1b 02 2a 6f 2b 94
b4 0b be 37 c3 0c 8e a1 5a 05 df 1b 2d 02 ef 8d
10 00 00 10 10 00 00 0c 10 00 00 08 10 00 00 10
10 00 00 08 10 00 00 0c 10 00 00 08 10 00 00 a0
10 00 00 9c 10 00 00 a4 10 00 00 14 10 00 00 18
10 00 00 1c 10 00 00 9c 10 00 00 a0 10 00 00 a4
10 00 00 a0 10 00 00 9c 10 00 00 18 10 00 00 14
10 00 00 1c 10 00 00 a0 10 00 00 9c 10 00 00 18
1 - 10 00 00 14 10 00 00 18 10 00 00 14 3c 4e 55 4c
2 - 4c 3e 00 00 20 05 36 58 20 05 3f cc 6b 0e 14 bd
    FAILED!
ewenmcneill commented 5 years ago

@zrecore We've previously found (at least in older) tinyprog versions, that there were issues with programming files of a specific size, and from memory the default firmware (HDMI2USB basic firmware) was one that was a file that triggered that behaviour.

Of note,

(1024*176)-180192 = 32

Ie, the file it's writing out is just short of 176KiB, and that's itself just short of 44 * 4096 KiB blocks. So I wonder if something about the file length is provoking this problem (eg, that files that are a multiple of some internal block work better).

It's definitely interesting that there's a "wrap around" pattern in your output. That might possibly be caused by repeated byte sequences in the file to be written, or data read back out of flash from a previous attempt that succeeded/failed). The offset where it's failling (Offset: 343808) feels to me like it's likely to be a fair way into the file being written (maybe towards the end?).

Ewen

ewenmcneill commented 5 years ago

Looks like that's reproducible -- the upload image for FIRMWARE=none reliably fails, and seems to reliably fail at exactly the same position, if it doesn't fail earlier due to a write timeout. So I think:

We probably need to fix both those issues, and then see what else is left as issues (I think there's a read timeout issue as well as a write timeout, so I think we probably have to fix both of those).

Ewen

Testing here:

  1. make clean; export FIRMWARE=stub; make image-flash gives:
tinyprog --program-image build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+stub.bin

    TinyProg CLI
    ------------
    Using device id 1d50:6130
    Only one board with active bootloader, using it.

    Programming /dev/ttyACM0 with build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+stub.bin
    Programming at addr 028000
    Waking up SPI flash
    203904 bytes to program
    Programming and Verifying: 100%|█████████| 204k/204k [00:05<00:00, 39.6kB/s]
    Success!
        __   _ __      _  __
       / /  (_) /____ | |/_/
      / /__/ / __/ -_)>  <
     /____/_/\__/\__/_/|_|
 SoC BIOS / CPU: LM32 /  16MHz
(c) Copyright 2012-2018 Enjoy-Digital
(c) Copyright 2007-2018 M-Labs Limited
Built May 20 2019 15:36:36

BIOS CRC passed (650092ce)
Booting from serial...
Press Q or ESC to abort boot completely.
sL5DdSMmkekro
Timeout
Booting from flash...
Executing booted program at 0x20058008
Stub firmware booting...

LiteX-Buildenv Stub Firmware, built May 20 2019 15:36:37

Type "help" for help

STUB>
  1. make clean; export FIRMWARE=micropython; make gateware; scripts/build-micropython.sh; make image-flash
tinyprog --program-image build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+micropython.bin

    TinyProg CLI
    ------------
    Using device id 1d50:6130
    Only one board with active bootloader, using it.

    Programming /dev/ttyACM0 with build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+micropython.bin
    Programming at addr 028000
    Waking up SPI flash
    442884 bytes to program
    Programming and Verifying: 100%|█████████| 443k/443k [00:11<00:00, 40.2kB/s]
    Success!
       / /  (_) /____ | |/_/
      / /__/ / __/ -_)>  <
     /____/_/\__/\__/_/|_|
 SoC BIOS / CPU: LM32 /  16MHz
(c) Copyright 2012-2018 Enjoy-Digital
(c) Copyright 2007-2018 M-Labs Limited
Built May 20 2019 15:41:15

BIOS CRC passed (d0aa97ab)
Booting from serial...
Press Q or ESC to abort boot completely.
sL5DdSMmkekro
Timeout
Booting from flash...
Executing booted program at 0x20058008
MicroPython v1.9.4-534-gd2bd404 on 2019-05-20; litex with lm32
>>> 
  1. make clean; export FIRMWARE=none; make image-flash
tinyprog --program-image build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin

    TinyProg CLI
    ------------
    Using device id 1d50:6130
    Only one board with active bootloader, using it.

    Programming /dev/ttyACM0 with build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin
    Programming at addr 028000
    Waking up SPI flash
    180192 bytes to program
    Programming and Verifying: 100%|████████▉| 179k/180k [00:04<00:00, 41.0kB/s]
Offset: 343808
Readback Data:
ae d1 6a 4a d9 d6 5a dc 40 df 0b 66 37 d8 3b f0
a9 bc ae 53 de bb 9e c5 47 b2 cf 7f 30 b5 ff e9
bd bd f2 1c ca ba c2 8a 53 b3 93 30 24 b4 a3 a6
ba d0 36 05 cd d7 06 93 54 de 57 29 23 d9 67 bf
b3 66 7a 2e c4 61 4a b8 5d 68 1b 02 2a 6f 2b 94
b4 0b be 37 c3 0c 8e a1 5a 05 df 1b 2d 02 ef 8d
10 00 00 10 10 00 00 0c 10 00 00 08 10 00 00 10
10 00 00 08 10 00 00 0c 10 00 00 08 10 00 00 a0
10 00 00 9c 10 00 00 a4 10 00 00 14 10 00 00 18
10 00 00 1c 10 00 00 9c 10 00 00 a0 10 00 00 a4
10 00 00 a0 10 00 00 9c 10 00 00 18 10 00 00 14
10 00 00 1c 10 00 00 a0 10 00 00 9c 10 00 00 18
Write Data:
ae d1 6a 4a d9 d6 5a dc 40 df 0b 66 37 d8 3b f0
a9 bc ae 53 de bb 9e c5 47 b2 cf 7f 30 b5 ff e9
bd bd f2 1c ca ba c2 8a 53 b3 93 30 24 b4 a3 a6
ba d0 36 05 cd d7 06 93 54 de 57 29 23 d9 67 bf
b3 66 7a 2e c4 61 4a b8 5d 68 1b 02 2a 6f 2b 94
b4 0b be 37 c3 0c 8e a1 5a 05 df 1b 2d 02 ef 8d
10 00 00 10 10 00 00 0c 10 00 00 08 10 00 00 10
10 00 00 08 10 00 00 0c 10 00 00 08 10 00 00 a0
10 00 00 9c 10 00 00 a4 10 00 00 14 10 00 00 18
10 00 00 1c 10 00 00 9c 10 00 00 a0 10 00 00 a4
10 00 00 a0 10 00 00 9c 10 00 00 18 10 00 00 14
10 00 00 1c 10 00 00 a0 10 00 00 9c 10 00 00 18
10 00 00 14 10 00 00 18 10 00 00 14 3c 4e 55 4c
4c 3e 00 00 20 05 36 58 20 05 3f cc 5d fe 0e 2b
    FAILED!
    Programming and Verifying: 100%|█████████| 180k/180k [00:05<00:00, 32.8kB/s]
targets/tinyfpga_bx/Makefile.mk:31: recipe for target 'image-flash-tinyfpga_bx' failed
make: *** [image-flash-tinyfpga_bx] Error 1
Makefile:211: warning: overriding recipe for target 'build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin'
targets/tinyfpga_bx/Makefile.mk:52: warning: ignoring old recipe for target 'build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin'
(LX P=tinyfpga_bx.minimal F=none R=default-target-tinyfpga-bx) ewen@parthenon:/s
rc/fpga/litex-buildenv$ 
tinyprog --program-image build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin

    TinyProg CLI
    ------------
    Using device id 1d50:6130
    Only one board with active bootloader, using it.

    Programming /dev/ttyACM0 with build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin
    Programming at addr 028000
    Waking up SPI flash
    180192 bytes to program
    Programming and Verifying:  21%|█▋      | 37.9k/180k [00:01<00:07, 19.0kB/s]
Failed to write to serial port:
Write timeout
targets/tinyfpga_bx/Makefile.mk:31: recipe for target 'image-flash-tinyfpga_bx' failed
make: *** [image-flash-tinyfpga_bx] Error 1
Makefile:211: warning: overriding recipe for target 'build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin'
targets/tinyfpga_bx/Makefile.mk:52: warning: ignoring old recipe for target 'build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin'
(LX P=tinyfpga_bx.minimal F=none R=default-target-tinyfpga-bx) ewen@parthenon:/s
rc/fpga/litex-buildenv$ 
tinyprog --program-image build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin

    TinyProg CLI
    ------------
    Using device id 1d50:6130
    Only one board with active bootloader, using it.

    Programming /dev/ttyACM0 with build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin
    Programming at addr 028000
    Waking up SPI flash
    180192 bytes to program
    Programming and Verifying: 100%|█████████| 180k/180k [00:05<00:00, 9.90kB/s]
Offset: 343808
Readback Data:
ae d1 6a 4a d9 d6 5a dc 40 df 0b 66 37 d8 3b f0
a9 bc ae 53 de bb 9e c5 47 b2 cf 7f 30 b5 ff e9
bd bd f2 1c ca ba c2 8a 53 b3 93 30 24 b4 a3 a6
ba d0 36 05 cd d7 06 93 54 de 57 29 23 d9 67 bf
b3 66 7a 2e c4 61 4a b8 5d 68 1b 02 2a 6f 2b 94
b4 0b be 37 c3 0c 8e a1 5a 05 df 1b 2d 02 ef 8d
10 00 00 10 10 00 00 0c 10 00 00 08 10 00 00 10
10 00 00 08 10 00 00 0c 10 00 00 08 10 00 00 a0
10 00 00 9c 10 00 00 a4 10 00 00 14 10 00 00 18
10 00 00 1c 10 00 00 9c 10 00 00 a0 10 00 00 a4
10 00 00 a0 10 00 00 9c 10 00 00 18 10 00 00 14
10 00 00 1c 10 00 00 a0 10 00 00 9c 10 00 00 18
Write Data:
ae d1 6a 4a d9 d6 5a dc 40 df 0b 66 37 d8 3b f0
a9 bc ae 53 de bb 9e c5 47 b2 cf 7f 30 b5 ff e9
bd bd f2 1c ca ba c2 8a 53 b3 93 30 24 b4 a3 a6
ba d0 36 05 cd d7 06 93 54 de 57 29 23 d9 67 bf
b3 66 7a 2e c4 61 4a b8 5d 68 1b 02 2a 6f 2b 94
b4 0b be 37 c3 0c 8e a1 5a 05 df 1b 2d 02 ef 8d
10 00 00 10 10 00 00 0c 10 00 00 08 10 00 00 10
10 00 00 08 10 00 00 0c 10 00 00 08 10 00 00 a0
10 00 00 9c 10 00 00 a4 10 00 00 14 10 00 00 18
10 00 00 1c 10 00 00 9c 10 00 00 a0 10 00 00 a4
10 00 00 a0 10 00 00 9c 10 00 00 18 10 00 00 14
10 00 00 1c 10 00 00 a0 10 00 00 9c 10 00 00 18
10 00 00 14 10 00 00 18 10 00 00 14 3c 4e 55 4c
4c 3e 00 00 20 05 36 58 20 05 3f cc c3 22 55 69
    FAILED!

targets/tinyfpga_bx/Makefile.mk:31: recipe for target 'image-flash-tinyfpga_bx' failed
make: *** [image-flash-tinyfpga_bx] Error 1
Makefile:211: warning: overriding recipe for target 'build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin'
targets/tinyfpga_bx/Makefile.mk:52: warning: ignoring old recipe for target 'build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin'
(LX P=tinyfpga_bx.minimal F=none R=default-target-tinyfpga-bx) ewen@parthenon:/s
rc/fpga/litex-buildenv$ 
mithro commented 5 years ago

FYI - The above messages from @zrecore were actually from me debugging on @zrecore's laptop.

This behaviour seems to occur with both Python 2 and Python 3. It also seems to occur with the version uploaded to pip and the version on GitHub.

ewenmcneill commented 5 years ago

@mithro

FYI - The above messages from @zrecore were actually from me debugging on @zrecore's laptop.

I did wonder about the timing when I later saw a tweet about you working on litex-buildenv for someone at MakerFaire :-)

To the best of my recollection FIRMWARE=none has never worked for me / anyone that I know, on a TinyFPGA BX. So my guess is that there's something wrong with the address calculation logic when tinyprog is forced to write fractions of a sector (there's some non-trivial logic needed to do that, and I do remember other people talking about other similar problems in the past). Writes that are full sectors (and here from the code I read last night I think "sector" == 4 * 1024 == 4096 bytes) are both handled by a different programming opcode and I think work reliably (give or take timeouts). I remember there were some changes to attempt to fix those partial sector write issues, but from memory they should have been in 1.0.23 (on PyPI, which was where I pulled the diff for what we put into GitHub, as TinyFPGA hadn't uploaded the relevant code to GitHub).

Also from the code reading last night, very early tinyprog had 0.2 second timeouts, and then about a year ago that was changed to 1.0 second timeouts (presumably due to hitting too many timeouts). Of note, it looks like tinyprog implicitly opens the serial port at 9600bps (https://github.com/tinyfpga/TinyFPGA-Bootloader/blob/6e312535cba4fc0216a59b1b9af651ba45fcc1d3/programmer/tinyprog/__init__.py#L107-L120; due to that being the default -- https://github.com/pyserial/pyserial/blob/acab9d2c0efb63323faebfd5e3405d77cd4b5617/serial/serialutil.py#L163-L192). In the case of USB serial, I'm unsure how strongly that speed limiting is enforced, but if it is really 9600bps then data will take a non-trivial amount of time to send simply due to the baud rate (approximately 1KiB/s).

FTR, I emailed TinyFPGA yesterday to ask about the choice of 1.0s timeouts. I've not heard back yet, but given they were originally 0.2s early on, it feels like 1.0s was just "5 times larger" / "mostly don't hit timeouts now". Rather than any specific reason for 1.0s as a short timeout. (My current hunch is 5.0s for read timeout, and 10.0s for write timeout would make more sense.)

Ewen

GitHub
tinyfpga/TinyFPGA-Bootloader
An open source USB bootloader for FPGAs. Contribute to tinyfpga/TinyFPGA-Bootloader development by creating an account on GitHub.
GitHub
pyserial/pyserial
Python serial port access library. Contribute to pyserial/pyserial development by creating an account on GitHub.
mithro commented 5 years ago

I need to dig out my tinyfpga-bx from a box somewhere to test myself. Probably be a couple of days before that happens...

ewenmcneill commented 5 years ago

After some experimentation, I'm pretty much convinced that there's an issue with writing partial "minor sectors". If I detect the minor sector, append enough 0x00 bytes to fill out the "minor sector", and then write/read, the write of the FIRMWARE=none file (32 bytes short of a 256 byte multiple) succeeds just fine.

So that confirms my hunch that it's failing on the very last write (if it doesn't fail due to an earlier timeout, which still happens about 1 time in 10 for me; I've not yet adjusted the timings).

Of note, the way that these writes are being done is in program_sectors, which calls self.write and self.read in 256 byte chunks (minor_sector_size). In particular this means that the looping handling in self.write() and self.read() is pretty much ignored, as it's only ever being asked to do 256 bytes at a time.

I've not yet traced the program flow enough to understand where there's an issue with writing shorter sectors. Based on your trace it almost looks like it gets the address wrong or something like that.

Also of note, the progress bar on programming for me seems to proceed at about 35KB/s, which implies that in practice it's not limited to 9600bps :-)

Ewen

PS: For reasons I don't understand max_length on the self.write call is 256, and max_length on the self.read call is 255. Which is a weird off by one behaviour. I've not examined self.write and self.read closely enough to understand the impact of that.

tinyprog --program-image build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin

    TinyProg CLI
    ------------
    Using device id 1d50:6130
    Only one board with active bootloader, using it.

    Programming /dev/ttyACM0 with build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin
    Programming at addr 028000
    Waking up SPI flash
    180192 bytes to program
    Programming and Verifying: 180kB [00:04, 39.6kB/s]                          
    Success!

(LX P=tinyfpga_bx.minimal F=none R=default-target-tinyfpga-bx) ewen@parthenon:/src/fpga/litex-buildenv$ 
(LX P=tinyfpga_bx.minimal F=micropython R=default-target-tinyfpga-bx) ewen@parthenon:/src/fpga/litex-buildenv$ tail -20 /tmp/ftrace.log 
read sector at 165376 len=256
write sector at 165632 len=256
read sector at 165632 len=256
write sector at 165888 len=256
read sector at 165888 len=256
write sector at 166144 len=256
read sector at 166144 len=256
write sector at 166400 len=256
read sector at 166400 len=256
write sector at 166656 len=256
read sector at 166656 len=256
write sector at 166912 len=256
read sector at 166912 len=256
write sector at 167168 len=256
read sector at 167168 len=256
write sector at 167424 len=256
read sector at 167424 len=256
short minor sector, padding by 32 octets
write sector at 167680 len=256
read sector at 167680 len=256
(LX P=tinyfpga_bx.minimal F=micropython R=default-target-tinyfpga-bx) ewen@parthenon:/src/fpga/litex-buildenv$ 
               minor_sector_size = 256
                for minor_offset in range(0, 4 * 1024, minor_sector_size):
                    minor_write_data = current_write_data[
                        minor_offset:minor_offset + minor_sector_size]
                    if (len(minor_write_data) < minor_sector_size):
                         # Short sector, pad the end
                         pad_len = minor_sector_size - len(minor_write_data)
                         padding = b'\x00' * pad_len

                         trace("short minor sector, padding by {0} octets".format(pad_len))
                         minor_write_data = bytearray(minor_write_data)
                         minor_write_data.extend(padding)
                         assert(len(minor_write_data) == minor_sector_size)

                    trace("write sector at {0} len={1}".format(addr+minor_offset, len(minor_write_data)))

where that trace() routine is a hack I added to my local copy to get some trace-to-file-with-arguments functionality to be able to debug this:

#vvv -- call tracing
do_trace = True
TRACE_FILE=None

if do_trace:
    TRACE_FILE=open('/tmp/ftrace.log', 'w')

def trace(message):
    global TRACE_FILE
    if TRACE_FILE:
        print(message, file=TRACE_FILE)
#^^^
ewenmcneill commented 5 years ago

@mithro, It turns out that self.cmd is basically just serialising raw SPI commands over the serial interface, in a byte encoded format, which means in self._write the 0x02 command is actually a literal SPI flash command -- pages 14-15 of the SPI flash datasheet (from the link in the TinyFPGA Bootloader README).

The relevant SPI flash command command (0x02) says:

The Byte/Page Program command allows anywhere from a single byte of data to 256 bytes of data to be programmed into previously erased memory locations.

which means we inherit all of the quirks of writing the page from the SPI flash, including:

If the starting memory address denoted by A23-A0 does not fall on an even 256-byte page boundary (A7-A0 are not all 0), then special circumstances regarding which memory locations to be programmed will apply. In this situation, any data that is sent to the device that goes beyond the end of the page will wrap around back to the beginning of the same page.

So that exactly explains the "wrap around" behaviour that you were seeing: if we give the flash an address which is a partial sector address, offset into the sector, the SPI flash chip itself will wrap it around the end of the sector to the start.

From that I conclude that somewhere in the partial sector write case we're ending up passing the wrong address to the flash chip. I'm unclear if that's in the tinyprog python code, or in the Verilog in the bootloader. It might possibly be better just to implement a variation of my "pad the sector" hack in the Python code and skip trying to debug precisely where the partial sector writes go wrong...

Ewen

ewenmcneill commented 5 years ago

FTR, in the failing mode, this is what self.cmd is generating to send to the bootloader for the final write / read commands:

(LX P=tinyfpga_bx.minimal F=micropython R=default-target-tinyfpga-bx) ewen@parthenon:/src/fpga/litex-buildenv$ tail -20 /tmp/ftrace.log | grep -B 99 read
write sector at 343808 len=224
CMD 2 at 343808 (len=224)
01 e4 00 00 00 02 05 3f 00 ae d1 6a 4a d9 d6 5a
dc 40 df 0b 66 37 d8 3b f0 a9 bc ae 53 de bb 9e
c5 47 b2 cf 7f 30 b5 ff e9 bd bd f2 1c ca ba c2
8a 53 b3 93 30 24 b4 a3 a6 ba d0 36 05 cd d7 06
93 54 de 57 29 23 d9 67 bf b3 66 7a 2e c4 61 4a
b8 5d 68 1b 02 2a 6f 2b 94 b4 0b be 37 c3 0c 8e
a1 5a 05 df 1b 2d 02 ef 8d 10 00 00 10 10 00 00
0c 10 00 00 08 10 00 00 10 10 00 00 08 10 00 00
0c 10 00 00 08 10 00 00 a0 10 00 00 9c 10 00 00
a4 10 00 00 14 10 00 00 18 10 00 00 1c 10 00 00
9c 10 00 00 a0 10 00 00 a4 10 00 00 a0 10 00 00
9c 10 00 00 18 10 00 00 14 10 00 00 1c 10 00 00
a0 10 00 00 9c 10 00 00 18 10 00 00 14 10 00 00
18 10 00 00 14 3c 4e 55 4c 4c 3e 00 00 20 05 36
58 20 05 3f cc 41 3b f5 48
read sector at 343808 len=224
(LX P=tinyfpga_bx.minimal F=micropython R=default-target-tinyfpga-bx) ewen@parthenon:/src/fpga/litex-buildenv$ 

as dumped with this kludge, plus some of my other kludges (but the sector padding disabled):

        input_addr = addr
        ...
        if input_addr and len(data) > 0:
            trace("CMD {0} at {1} (len={2})".format(opcode, input_addr, len(data)))
            trace(pretty_hex(cmd_write_string))

injected into self.cmd (with the first bit to save the raw input argument, because the code manages to replace its input argument with a 24-bit big-endian packed version of the input address).

The corresponding write failure for that run is:

tinyprog --program-image build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin

    TinyProg CLI
    ------------
    Using device id 1d50:6130
    Only one board with active bootloader, using it.

    Programming /dev/ttyACM0 with build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin
    Programming at addr 028000
    Waking up SPI flash
    180192 bytes to program
    Programming and Verifying: 100%|█████████| 180k/180k [00:06<00:00, 9.81kB/s]
Offset: 343808
Readback Data:
ae d1 6a 4a d9 d6 5a dc 40 df 0b 66 37 d8 3b f0
a9 bc ae 53 de bb 9e c5 47 b2 cf 7f 30 b5 ff e9
bd bd f2 1c ca ba c2 8a 53 b3 93 30 24 b4 a3 a6
ba d0 36 05 cd d7 06 93 54 de 57 29 23 d9 67 bf
b3 66 7a 2e c4 61 4a b8 5d 68 1b 02 2a 6f 2b 94
b4 0b be 37 c3 0c 8e a1 5a 05 df 1b 2d 02 ef 8d
10 00 00 10 10 00 00 0c 10 00 00 08 10 00 00 10
10 00 00 08 10 00 00 0c 10 00 00 08 10 00 00 a0
10 00 00 9c 10 00 00 a4 10 00 00 14 10 00 00 18
10 00 00 1c 10 00 00 9c 10 00 00 a0 10 00 00 a4
10 00 00 a0 10 00 00 9c 10 00 00 18 10 00 00 14
10 00 00 1c 10 00 00 a0 10 00 00 9c 10 00 00 18
Write Data:
ae d1 6a 4a d9 d6 5a dc 40 df 0b 66 37 d8 3b f0
a9 bc ae 53 de bb 9e c5 47 b2 cf 7f 30 b5 ff e9
bd bd f2 1c ca ba c2 8a 53 b3 93 30 24 b4 a3 a6
ba d0 36 05 cd d7 06 93 54 de 57 29 23 d9 67 bf
b3 66 7a 2e c4 61 4a b8 5d 68 1b 02 2a 6f 2b 94
b4 0b be 37 c3 0c 8e a1 5a 05 df 1b 2d 02 ef 8d
10 00 00 10 10 00 00 0c 10 00 00 08 10 00 00 10
10 00 00 08 10 00 00 0c 10 00 00 08 10 00 00 a0
10 00 00 9c 10 00 00 a4 10 00 00 14 10 00 00 18
10 00 00 1c 10 00 00 9c 10 00 00 a0 10 00 00 a4
10 00 00 a0 10 00 00 9c 10 00 00 18 10 00 00 14
10 00 00 1c 10 00 00 a0 10 00 00 9c 10 00 00 18
10 00 00 14 10 00 00 18 10 00 00 14 3c 4e 55 4c
4c 3e 00 00 20 05 36 58 20 05 3f cc 41 3b f5 48
    FAILED!

targets/tinyfpga_bx/Makefile.mk:31: recipe for target 'image-flash-tinyfpga_bx' failed
make: *** [image-flash-tinyfpga_bx] Error 1
(LX P=tinyfpga_bx.minimal F=none R=default-target-tinyfpga-bx) ewen@parthenon:/s
rc/fpga/litex-buildenv$ 

in case we want to try to align individual bits of the hex dumps.

I started trying to read the verilog which handles the USB serial to SPI bridge, but it's a twisty maze of nested state machines, with no documentation, so it's a bit hard to be certain what's happening at any given point from a casual glance.

Given it's trivial to pad to a full sector in Python (and I discovered we should pad to the "erased" value -- ie, 0xff -- rather than 0x00, because of how flash works), I'm strongly leaning towards just doing that pad-to-256-octets in Python and ignoring where it goes wrong as anything other than an academic exercise....

Ewen

ewenmcneill commented 5 years ago

In this write command (from above):

write sector at 343808 len=224
CMD 2 at 343808 (len=224)
01 e4 00 00 00 02 05 3f 00 ae d1 6a 4a d9 d6 5a
dc 40 df 0b 66 37 d8 3b f0 a9 bc ae 53 de bb 9e
c5 47 b2 cf 7f 30 b5 ff e9 bd bd f2 1c ca ba c2
8a 53 b3 93 30 24 b4 a3 a6 ba d0 36 05 cd d7 06
93 54 de 57 29 23 d9 67 bf b3 66 7a 2e c4 61 4a
b8 5d 68 1b 02 2a 6f 2b 94 b4 0b be 37 c3 0c 8e
a1 5a 05 df 1b 2d 02 ef 8d 10 00 00 10 10 00 00
0c 10 00 00 08 10 00 00 10 10 00 00 08 10 00 00
0c 10 00 00 08 10 00 00 a0 10 00 00 9c 10 00 00
a4 10 00 00 14 10 00 00 18 10 00 00 1c 10 00 00
9c 10 00 00 a0 10 00 00 a4 10 00 00 a0 10 00 00
9c 10 00 00 18 10 00 00 14 10 00 00 1c 10 00 00
a0 10 00 00 9c 10 00 00 18 10 00 00 14 10 00 00
18 10 00 00 14 3c 4e 55 4c 4c 3e 00 00 20 05 36
58 20 05 3f cc 41 3b f5 48

and the remainder of the sector is the data to write, here ae d1 6a 4a ... 41 3b f5 48, which appears to match what is reported for the write data in the tinyprog failure output.

So I think the write command itself, as issued by tinyprog itself, is valid. Which makes me think the problem occurs within the firmware / SPI flash.

Of note, the read report in that request above gave 12 16 octets back (192) rather than 14 16 octets back (224), which implies maybe we got a short read. But at a glance the first 192 octets did match; it was just short in that case. Versus some of your other cases where it was offset. So maybe we're triggering edge behaviour in either the TinyFPGA bootloader or the SPI flash with partial sectors.

Ewen

PS: If you want to play with it, https://github.com/ewenmcneill/TinyFPGA-Bootloader/tree/partial-sector-write-debug-hacks has a push of this kludgy debugging code as I was testing with it. In program_sectors there's a if (False and ...) which is turning off the padding work around, so you can try turning that off/on if you want.

GitHub
ewenmcneill/TinyFPGA-Bootloader
An open source USB bootloader for FPGAs. Contribute to ewenmcneill/TinyFPGA-Bootloader development by creating an account on GitHub.
zrecore commented 5 years ago

Trying to figure out what I can do to contribute a fix. Fiddling around w/ the ewencmneill/TinyFPGA-Bootloader version of tinyprog...

make[1]: Leaving directory '/home/aalbino/litex-buildenv'
tinyprog --program-image build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin

    TinyProg CLI
    ------------
    Using device id 1d50:6130
    Only one board with active bootloader, using it.

    Programming /dev/ttyACM0 with build/tinyfpga_bx_base_lm32.minimal//image-gateware+bios+none.bin
    Programming at addr 028000
    Waking up SPI flash
    180192 bytes to program
    Programming and Verifying: 100%|██████████████████████████████████████████████████| 180k/180k [00:06<00:00, 9.64kB/s]
Offset: 343808
Readback Data:
ae d1 6a 4a d9 d6 5a dc 40 df 0b 66 37 d8 3b f0
a9 bc ae 53 de bb 9e c5 47 b2 cf 7f 30 b5 ff e9
bd bd f2 1c ca ba c2 8a 53 b3 93 30 24 b4 a3 a6
ba d0 36 05 cd d7 06 93 54 de 57 29 23 d9 67 bf
b3 66 7a 2e c4 61 4a b8 5d 68 1b 02 2a 6f 2b 94
b4 0b be 37 c3 0c 8e a1 5a 05 df 1b 2d 02 ef 8d
10 00 00 10 10 00 00 0c 10 00 00 08 10 00 00 10
10 00 00 08 10 00 00 0c 10 00 00 08 10 00 00 a0
10 00 00 9c 10 00 00 a4 10 00 00 14 10 00 00 18
10 00 00 1c 10 00 00 9c 10 00 00 a0 10 00 00 a4
10 00 00 a0 10 00 00 9c 10 00 00 18 10 00 00 14
10 00 00 1c 10 00 00 a0 10 00 00 9c 10 00 00 18
Write Data:
ae d1 6a 4a d9 d6 5a dc 40 df 0b 66 37 d8 3b f0
a9 bc ae 53 de bb 9e c5 47 b2 cf 7f 30 b5 ff e9
bd bd f2 1c ca ba c2 8a 53 b3 93 30 24 b4 a3 a6
ba d0 36 05 cd d7 06 93 54 de 57 29 23 d9 67 bf
b3 66 7a 2e c4 61 4a b8 5d 68 1b 02 2a 6f 2b 94
b4 0b be 37 c3 0c 8e a1 5a 05 df 1b 2d 02 ef 8d
10 00 00 10 10 00 00 0c 10 00 00 08 10 00 00 10
10 00 00 08 10 00 00 0c 10 00 00 08 10 00 00 a0
10 00 00 9c 10 00 00 a4 10 00 00 14 10 00 00 18
10 00 00 1c 10 00 00 9c 10 00 00 a0 10 00 00 a4
10 00 00 a0 10 00 00 9c 10 00 00 18 10 00 00 14
10 00 00 1c 10 00 00 a0 10 00 00 9c 10 00 00 18
10 00 00 14 10 00 00 18 10 00 00 14 3c 4e 55 4c
4c 3e 00 00 20 05 36 58 20 05 3f cc 48 00 5b a7
    FAILED!

make: *** [targets/tinyfpga_bx/Makefile.mk:27: gateware-flash-tinyfpga_bx] Error 1
ewenmcneill commented 5 years ago

@zrecore Your results look like you're getting a short read back, at the end. Interestingly it seems to be on the final sector too. I didn't do anything about changing the timeouts in my experiments (on the partial-sector-write-debug-hacks branch), and because I was hoping to hear back from TinyFPGA in reply to my email about the timeout choice.

I never did hear anything back in response to my email, and from what I could see they were originally 0.2s/0.2s read/write timeouts, which is painfully too short, and became 1s/1s read/write timeouts just to make them longer. So I think we could bump those timeouts up a bit more without breaking anything.

You might want to try playing with the serial timeouts, and see if that makes a difference. My gut feeling is 2s read timeout, 5s write timeout make sense (longer for write, as a write timeout causing a partial write is more fatal).

I've been meaning to turn that experiment branch into an actual cleaned up pull request (to always pad to a full sector/write a full sector, since there seems no reason not to do so -- the whole sector is always erased above, and would be erased prior to any subsequent rewrite). And probably also increase the timeouts too. But life has intervened the last couple of weeks, and I've not even plugged my TinyFPGA BX in at all... maybe in the next week.

Making an actual pull request/merging it, would potentially allow us to do an actual release with a fix (mithro's time permitting; IIRC he's at a conference at present). And that'd also make it easier to test if it helped.

Ewen

ewenmcneill commented 5 years ago

I made an actual pull request with a cleaned up version (and longer timeouts):

https://github.com/tinyfpga/TinyFPGA-Bootloader/pull/52

It seems to work for me (the pull request has a bunch of test results).

@zrecore FYI, I noticed when I was doing that that the previous kludge branch I'd pushed had an "if (False and ...) condition on the sector padding, so if you didn't spot that in your testing you were probably testing with the fix behaviour disabled. IIRC I did that so I could experiment with whether the sector padding was what was making the difference or just some timing. Apologies if that caught you out. (The pull request has the padding always enabled because AFAICT there's no reason not to do the padding, given the sector was just erased anyway -- there can't be "good data" after the bit we're writing in the sector :-) But that was how I noticed I'd if (False ...) out the work around....)

Ewen