im-tomu / foboot

Bootloader for Fomu
Apache License 2.0

Add support for ECP5 hardware #28

Open mithro opened 4 years ago

mithro commented 4 years ago

It would be good to support ECP5 hardware like the OrangeCrab from @GregDavill and potentially other ECP5 hardware in the future.

mithro commented 4 years ago

@gregdavill might already have this working...

gregdavill commented 4 years ago

Here is my almost-working implementation. I still need to bring it up to speed with all the mainline changes: https://github.com/gregdavill/foboot

xobs commented 4 years ago

There needs to be a way to have eptri run at something other than a Wishbone frequency of 12 MHz. I'm not sure the best way to do that...

gregdavill commented 4 years ago

Initially we can operate the wishbone at 12MHz.

Feel free to assign this to me as I've got the OrangeCrab hardware to test on. I'll create a pull request to get comments once I've got it running.

gregdavill commented 4 years ago

I've got the main code base ported and it's "running" now, with some small issues I'm still debugging.

Do you have a reference of the configuration options used to generate the "Fomu" vexriscv variant? Might aid with debugging.

xobs commented 4 years ago

The SPI CLK signal sounds like an ECP5 quirk. If you're using the standard ECP5 SPI pins, you'll need to do a special dance to get that working:

https://github.com/xobs/haddecks/blob/master/haddecks.py#L423-L425

            flash = SpiFlashDualQuad(platform.request("spiflash4x"), dummy=6, endianness="little")
            flash.add_clk_primitive(self.platform.device)

The vexriscv core was generated from https://github.com/xobs/VexRiscv-verilog. I'm not sure what could cause the issue, unless you're using the "linux" variant which throws out the CSR map and substitutes its own.

enjoy-digital commented 4 years ago

@gregdavill: as @xobs says, on ECP5, you need to instantiate a USRMCLK primitive to drive the SPIFlash clock, that's done by add_clk_primitive: https://github.com/enjoy-digital/litex/blob/master/litex/soc/cores/spi_flash.py#L65-L67.

gregdavill commented 4 years ago

Clock output using the USRMCLK primitive built into spi_flash.py is working well.

When switching to bit-banging mode, /WP and /HOLD are tri-stated. My hardware doesn't have any pull-ups on these pins, which appears to be the main cause of my issues right now.

xobs commented 4 years ago

Those pins are only tri-stated when they're inputs. When they're outputs, they should reflect the value of MOSI. It's not ideal at all, but it shouldn't be an issue as long as you do one-bit writes and you have the QE bit set in your flash config.

gregdavill commented 4 years ago

I think my issue might be the QE bit then... I don't think that is set. So the FLASH is treating IO2/IO3 as HOLD and WP functions.
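To make the QE point concrete, here is a minimal sketch of the bit test involved. It assumes a Winbond W25Q-style layout, where Quad Enable is bit 1 of Status Register 2; other vendors place QE elsewhere, so the datasheet for the actual part is the authority. The function names are hypothetical and nothing here talks to hardware.

```python
# Assumption: Winbond W25Q-style layout, where Quad Enable (QE) is bit 1 of
# Status Register 2. Other flash vendors place QE in a different register/bit.
QE_BIT_SR2 = 1 << 1


def quad_enabled(status_reg2: int) -> bool:
    """Return True if the QE bit is set in a Status Register 2 value."""
    return bool(status_reg2 & QE_BIT_SR2)


def io2_io3_role(status_reg2: int) -> str:
    """Describe how the flash treats IO2/IO3 for a given SR2 value."""
    if quad_enabled(status_reg2):
        return "IO2/IO3 are data pins"
    return "IO2/IO3 act as /WP and /HOLD"


print(io2_io3_role(0x00))  # QE clear
print(io2_io3_role(0x02))  # QE set
```

With QE clear, floating /WP and /HOLD inputs can stall or block transfers, which would match the symptoms described above.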

gregdavill commented 4 years ago

I'm still trying to understand why the vexriscv core you've configured doesn't like running on my ecp5 platform.

The only major change between the up5k and the ecp5 is the lack of SPRAM, so I've replaced it with a standard LiteX SRAM module.

I can swap the entire core for another variant from your repo and it works, i.e. VexRiscv_Fomu.v <= VexRiscv_HaD.v. But the original VexRiscv_Fomu.v core seems to be hitting a wishbone error when accessing the CSRs. I'm sure it's not an issue with the CSR mappings, as it works when the CPU is swapped. I see you've enabled the MMU on these cores, and tbh I'm not really sure how that works on VexRiscv.

xobs commented 4 years ago

The MMU shouldn't be an issue as long as $satp is 0. Even so, I'm not sure if I'll keep it in future builds.

Are you running with the debug build enabled? That's undergone much more testing.

gregdavill commented 4 years ago

I've just tried switching back to the debug build, no difference.

It's weird because it works on the up5k Fomu hardware. Which suggests the issue comes from differences between icestorm/trellis.

I've just set up my environment to build VexRiscv from source, so I'll play around with some configurations and see if I can nail down the specific settings that are causing the issue.

gregdavill commented 4 years ago

I've performed a manual bisect of the various VexRiscv options, specifically --pipelining false is what's causing the grief.

I'm using https://github.com/xobs/VexRiscv-verilog, and running vexriscv.GenFomu script

If anyone is curious, these are my sources:

ext/SpinalHDL v1.3.6 git: 9bf01e7f
ext/VexRiscv: b290b25f
VexRiscv-verilog: b1f61b5a

For now I'll just use a custom Fomu VexRiscv with this option turned off, and continue the porting.

I need to set up the auto-reset and, more importantly, ECP5 multiboot bitstreams.

gregdavill commented 4 years ago

I've got an initial port done: https://github.com/gregdavill/foboot/tree/OrangeCrab I still want to put it through its paces more; it seems a bit fragile to build sometimes.

I'm not sure of the best way to structure the code differences between platforms. Right now I'm using lots of if platform.device[:4] != "LFE5": statements, which is not ideal.

xobs commented 4 years ago

I feel that a lot of the platform-specific stuff could be added to the Platform constructor. Having it add the crg, and perhaps populating a list of all of the platform-specific modules. Modules such as touch, rgb, and build commands.

I also feel the build commands should be upstreamed into litex-hub, though they are the sort of thing you might tune per-project.

For example, you can set self.crg = litex_boards.partner.targets.fomu._CRG for a Fomu board, or define a _CRG class for the OrangeCrab and assign it to self.crg within Platform.__init__().
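A rough sketch of that idea, with plain Python classes standing in for the real LiteX platform and SoC classes. The class names, the module lists, and the add_* hooks are all assumptions for illustration, not the foboot or litex-boards API. The point is only that the SoC asks the platform what to add instead of checking device strings.

```python
class FomuPlatform:
    """Stand-in for a Fomu platform; not the real LiteX class."""
    name = "fomu"
    modules = ["crg", "usb", "spiflash", "touch", "rgb"]  # illustrative list


class OrangeCrabPlatform:
    """Stand-in for an OrangeCrab platform: no touch pads, a button instead."""
    name = "orangecrab"
    modules = ["crg", "usb", "spiflash", "button", "rgb"]  # illustrative list


class BaseSoC:
    """Builds itself from whatever the platform lists, with no device checks."""

    def __init__(self, platform):
        self.platform = platform
        self.added = []
        for module in platform.modules:
            # Look up add_<module>; an AttributeError flags a missing hook.
            getattr(self, f"add_{module}")()

    def _record(self, name):
        # A real hook would instantiate hardware; this one only logs the call.
        self.added.append(name)

    def add_crg(self):
        self._record("crg")

    def add_usb(self):
        self._record("usb")

    def add_spiflash(self):
        self._record("spiflash")

    def add_touch(self):
        self._record("touch")

    def add_button(self):
        self._record("button")

    def add_rgb(self):
        self._record("rgb")


print(BaseSoC(FomuPlatform()).added)
print(BaseSoC(OrangeCrabPlatform()).added)
```

Adding a new board then means writing one platform class, and the device-string conditionals disappear from the SoC.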

mithro commented 4 years ago

Maybe we can take inspiration from linux-on-litex-vexriscv's approach.

gregdavill commented 4 years ago

@mithro This is the direction I'm thinking of moving.

I'd like a way to add some platform-specific build commands. For example, with ECP5 targets it might be useful to be able to specify the 25k, 45k, or 85k variant.

The OrangeCrab, and the TinyFPGA EX boards also don't have touch modules. So I'm thinking of defining a new module for button and then using macros to enable/disable firmware based on what modules are loaded on the LiteX side.
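One way the macro idea could look on the build side, as a sketch only: render a small C header from the list of modules the SoC actually instantiated, so the firmware can guard code with #ifdef. The HAS_* macro names, the FEATURES_H guard, and the feature_header helper are hypothetical.

```python
def feature_header(modules):
    """Render one #define per loaded module for use in firmware #ifdef guards."""
    lines = ["#ifndef FEATURES_H", "#define FEATURES_H", ""]
    for name in sorted(modules):
        # Hypothetical naming scheme: touch -> HAS_TOUCH, button -> HAS_BUTTON.
        lines.append(f"#define HAS_{name.upper()} 1")
    lines += ["", "#endif /* FEATURES_H */"]
    return "\n".join(lines) + "\n"


# Example: a board with a button and an RGB LED but no touch pads.
print(feature_header(["button", "rgb"]))
```

Firmware would then wrap the touch handling in #ifdef HAS_TOUCH and the button handling in #ifdef HAS_BUTTON.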

enjoy-digital commented 4 years ago

@gregdavill: not sure it's useful, but on ULX3S we are supporting device variants with:

https://github.com/litex-hub/litex-boards/blob/master/litex_boards/partner/platforms/ulx3s.py#L74
https://github.com/litex-hub/litex-boards/blob/master/litex_boards/partner/targets/ulx3s.py#L79-L80

gregdavill commented 4 years ago

Thanks for the examples! That is helpful.

I'm refactoring the BaseSoC to add different modules by calling add_[module] from the platform. https://github.com/gregdavill/foboot/blob/OrangeCrab/hw/foboot-bitstream.py#L202

I've also added some arg parser options to the platform, so after the platform is defined the arguments are parsed again. I'm not sure if this is the correct way of doing this, but does seem to work. This enables different platforms to define their own revisions.
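For reference, the two-pass parsing described here can be done with the standard argparse module alone. This is a sketch under assumptions, not the code in the branch: the option names mirror the ones used in this thread, while the helper functions and the exact choices lists are made up for illustration.

```python
import argparse


def add_fomu_args(parser):
    # Revisions taken from the "Fomu [PVT/EVT/DVT/Hacker]" strings.
    parser.add_argument("--revision", required=True,
                        choices=["evt", "dvt", "pvt", "hacker"])


def add_orangecrab_args(parser):
    parser.add_argument("--revision", default="r0_2", choices=["r0_2"])
    # ECP5 size variants; only meaningful on ECP5 boards.
    parser.add_argument("--device", default="25F", choices=["25F", "45F", "85F"])


PLATFORM_ARGS = {"fomu": add_fomu_args, "orangecrab": add_orangecrab_args}


def parse_args(argv):
    # Pass 1: find out which platform was requested and ignore everything else.
    base = argparse.ArgumentParser(add_help=False)
    base.add_argument("--platform", required=True, choices=sorted(PLATFORM_ARGS))
    known, _unused = base.parse_known_args(argv)

    # Pass 2: let that platform register its own options, then parse it all.
    parser = argparse.ArgumentParser(parents=[base])
    PLATFORM_ARGS[known.platform](parser)
    return parser.parse_args(argv)


print(parse_args(["--platform", "orangecrab", "--revision", "r0_2", "--device", "85F"]))
print(parse_args(["--platform", "fomu", "--revision", "pvt"]))
```

parse_known_args() is what makes the first pass tolerate options it has not been told about yet, and the second pass still rejects anything the selected platform does not define.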

gregdavill commented 4 years ago

@xobs, what are your thoughts on sharing one USB VID:PID across the various platforms? Or should each platform have its own?

I'm specifically wondering if it's beneficial to use different strings for different hardware (Fomu / OrangeCrab) in the device description.

Product: Fomu PVT running DFU Bootloader v1.9.1
Manufacturer: Foosn
Product: OrangeCrab r0.2 running DFU Bootloader v2.0.3-1-gc9571b5
Manufacturer: Foosn

xobs commented 4 years ago

I think the intention of VID:PID is to identify a particular device, so it would make most sense to have each platform use its own. Similarly, it would make the most sense to have your own Product, Manufacturer, and Serial strings. We already have "Fomu [PVT/EVT/DVT/Hacker]" strings, so it should be a simple extension of that.

gregdavill commented 4 years ago

I've requested a new code for the OrangeCrab (via pid.codes).

I've got the port to an "alpha" state. Its basic functions work.

The boot-loader is entered when the button is held while the user plugs the device in. The ECP5 bitstream is set up to multi-boot into 0x180000. The reboot CSR asserts PROGRAMN, which activates this jump.

I've split all of the platform unique stuff into their own separate platform files. Fomu, and OrangeCrab.

So I can build an OrangeCrab version with: python3 foboot-bitstream.py --platform orangecrab --revision r0_2 --device 85F

And a Fomu Version like: python3 foboot-bitstream.py --platform fomu --revision pvt

I found the soft IP for the SBLED block that you showed me at 36c3 wasn't replicating the same effects as the hard IP, so I've just started creating a new module from scratch. It's not complete, but gets the point across for now.

xobs commented 4 years ago

That looks like a good approach. I look forward to being able to try it out.

xobs commented 4 years ago

Incidentally, and apropos of nothing, how does resource utilization change if you set the CSR width to 32 instead of the default of 8?

--- a/hw/foboot-bitstream.py
+++ b/hw/foboot-bitstream.py
@@ -136,7 +137,7 @@ class BaseSoC(SoCCore, AutoDoc):
         clk_freq = int(12e6)
         self.submodules.crg = _CRG(platform)

-        SoCCore.__init__(self, platform, clk_freq, integrated_sram_size=0, with_uart=False, **kwargs)
+        SoCCore.__init__(self, platform, clk_freq, integrated_sram_size=0, with_uart=False, csr_data_width=32, **kwargs)

         usb_debug = False
         if debug is not None:

We're going to move to 32-bit CSRs on Betrusted, and resource utilization is almost identical with Fomu with 32-bit CSRs, so I'm curious to see what it's like on ECP5.

gregdavill commented 4 years ago

8 bit

Info: Logic utilisation before packing:
Info:     Total LUT4s:      5681/24288    23%
Info:         logic LUTs:   4529/24288    18%
Info:         carry LUTs:    630/24288     2%
Info:           RAM LUTs:    348/12144     2%
Info:          RAMW LUTs:    174/ 6072     2%

Info:      Total DFFs:      4732/24288    19%
                  ---------
Info: Device utilisation:
Info:          TRELLIS_SLICE:  3866/12144    31%
Info:             TRELLIS_IO:    14/  196     7%
Info:                   DCCA:     2/   56     3%
Info:                 DP16KD:    17/   56    30%

32 bit

Info: Logic utilisation before packing:
Info:     Total LUT4s:      5351/24288    22%
Info:         logic LUTs:   4199/24288    17%
Info:         carry LUTs:    630/24288     2%
Info:           RAM LUTs:    348/12144     2%
Info:          RAMW LUTs:    174/ 6072     2%

Info:      Total DFFs:      4876/24288    20%
                  ---------
Info: Device utilisation:
Info:          TRELLIS_SLICE:  3723/12144    30%
Info:             TRELLIS_IO:    14/  196     7%
Info:                   DCCA:     2/   56     3%
Info:                 DP16KD:    17/   56    30%

Interesting, slightly more DFFs used, but less logic.
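For anyone wanting to repeat the comparison, a small sketch that pulls the used counts out of nextpnr "Info:" lines like the ones above and prints the 8-bit to 32-bit difference. The log excerpts are copied from this comment; the regular expression and the helper name are assumptions about the log layout, not part of nextpnr.

```python
import re

LOG_8BIT = """\
Info:     Total LUT4s:      5681/24288    23%
Info:      Total DFFs:      4732/24288    19%
Info:          TRELLIS_SLICE:  3866/12144    31%
"""

LOG_32BIT = """\
Info:     Total LUT4s:      5351/24288    22%
Info:      Total DFFs:      4876/24288    20%
Info:          TRELLIS_SLICE:  3723/12144    30%
"""

# Matches "Info: <name>: <used>/<total>", tolerating padding after the slash.
LINE = re.compile(r"Info:\s+(?P<name>[\w ]+?):\s+(?P<used>\d+)/\s*(?P<total>\d+)")


def utilisation(log):
    """Map each resource name in a log excerpt to its used count."""
    return {m["name"].strip(): int(m["used"]) for m in LINE.finditer(log)}


before = utilisation(LOG_8BIT)
after = utilisation(LOG_32BIT)
delta = {name: after[name] - before[name] for name in before}

for name, change in delta.items():
    print(f"{name}: {before[name]} -> {after[name]} ({change:+d})")
```

On these numbers the 32-bit build uses 330 fewer LUT4s and 143 fewer slices, at the cost of 144 more DFFs.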

xobs commented 4 years ago

That's about what I expected. And does it change much when you mess around with --seed? In Fomu, they're basically indistinguishable.

gregdavill commented 4 years ago

I set up the --seed parameter for the OrangeCrab platform and just let my computer churn away for a bit. I only did 11 runs each for 8-bit and 32-bit. It's basically all within the margin of error.

xobs commented 4 years ago

Interesting, thanks for doing that! That's really good data.

gregdavill commented 4 years ago

Just fixed another bug that was present in the ECP5 port. I'd misunderstood the logic in reboot(), so this wasn't working. I've just been cycling the USB port to exit the bootloader.

I'm using this snippet, which was borrowed from the hadbadge ipl. I'll extend this to support diamond bitstreams too.

/* Needs <string.h> for memcmp(). */
const char magic[] = "\xFF\x00Part: LFE5";

/* Compare the raw header bytes; sizeof() - 1 skips the string's trailing NUL. */
if (memcmp(destination_array, magic, sizeof(magic) - 1) == 0) {
    riscv_boot = 0; // FLASH appears to be an ECP5 bitstream
} else {
    riscv_boot = 1; // Assume it's RISCV code, and jump to it.
}
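The same check is easy to run on the host before flashing an image. This is a sketch that reuses the magic bytes from the snippet above; the function name is hypothetical, and treating those 12 bytes as the signature of every ECP5 bitstream is an assumption (Diamond-generated bitstreams, as noted, would need a separate check).

```python
# Magic bytes copied from the firmware snippet: 0xFF 0x00 then "Part: LFE5".
ECP5_MAGIC = b"\xff\x00Part: LFE5"


def looks_like_ecp5_bitstream(image: bytes) -> bool:
    """Return True if the image starts with the ECP5 header bytes."""
    return image.startswith(ECP5_MAGIC)


# An image that opens with the magic bytes is treated as a bitstream,
# and anything else is assumed to be RISC-V code.
print(looks_like_ecp5_bitstream(ECP5_MAGIC + b"U-85F"))
print(looks_like_ecp5_bitstream(b"\x13\x00\x00\x00"))
```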