FPGA support - Githubissues

cmdada commented 4 months ago

There's both the Xilinx Zynq SoC with its integrated FPGA (usually referred to as "the" RIO FPGA) and the Lattice MachXO2-640. Neither of these has easy QEMU support, but I found https://www.xilinx.com/video/soc/introduction-to-qemu.html and https://elinux.org/images/9/95/Jw-ei-elc2010-final.pdf

cmdada commented 4 months ago

The MachXO2-640 has no PLL and two EBR blocks, with 5 kbits of distributed RAM, 18 kbits of EBR SRAM, and 24 kbits of UFM.

Slice Signal Descriptions

Function  Type              Signal Names    Description
Input     Data signal       A0, B0, C0, D0  Inputs to LUT4
Input     Data signal       A1, B1, C1, D1  Inputs to LUT4
Input     Multi-purpose     M0/M1           Multi-purpose input
Input     Control signal    CE              Clock enable
Input     Control signal    LSR             Local set/reset
Input     Control signal    CLK             System clock
Input     Inter-PFU signal  FCIN            Fast carry in [1]
Output    Data signals      F0, F1          LUT4 output register bypass signals
Output    Data signals      Q0, Q1          Register outputs
Output    Data signals      OFX0            Output of a LUT5 MUX
Output    Data signals      OFX1            Output of a LUT6, LUT7, LUT8 [2] MUX depending on the slice
Output    Inter-PFU signal  FCO             Fast carry out

Modes of Operation

Each slice has up to four potential modes of operation: Logic, Ripple, RAM and ROM.

Logic Mode

In this mode, the LUTs in each slice are configured as 4-input combinatorial lookup tables. A LUT4 can have 16 possible input combinations. Any four-input logic function can be generated by programming this lookup table. Since there are two LUT4s per slice, a LUT5 can be constructed within one slice. Larger lookup tables such as LUT6, LUT7 and LUT8 can be constructed by concatenating other slices. Note that LUT8 requires more than four slices.

Ripple Mode

Ripple mode supports the efficient implementation of small arithmetic functions. In Ripple mode, the following functions can be implemented by each slice:

• Addition 2-bit
• Subtraction 2-bit
• Add/subtract 2-bit using dynamic control
• Up counter 2-bit
• Down counter 2-bit
• Up/down counter with asynchronous clear
• Up/down counter with preload (sync)
• Ripple mode multiplier building block
• Multiplier support
• Comparator functions of A and B inputs
  - A greater-than-or-equal-to B
  - A not-equal-to B
  - A less-than-or-equal-to B

Ripple mode includes an optional configuration that performs arithmetic using fast carry chain methods. In this configuration (also referred to as CCU2 mode), two additional signals, Carry Generate and Carry Propagate, are generated on a per-slice basis to allow fast arithmetic functions to be constructed by concatenating slices.

RAM Mode

In this mode, a 16x4-bit distributed single port RAM (SPR) can be constructed by using each LUT block in Slice 0 and Slice 1 as a 16x1-bit memory. Slice 2 is used to provide memory address and control signals. MachXO2 devices support distributed memory initialization.

The Lattice design tools support the creation of a variety of different size memories. Where appropriate, the software will construct these using distributed memory primitives that represent the capabilities of the PFU.
Table 2-3 shows the number of slices required to implement different distributed RAM primitives. For more information about using RAM in MachXO2 devices, please see TN1201, Memory Usage Guide for MachXO2 Devices.

Table 2-3. Number of Slices Required For Implementing Distributed RAM

                  SPR 16x4  PDPR 16x4
Number of slices  3         3

Note: SPR = Single Port RAM, PDPR = Pseudo Dual Port RAM
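To make the slice modes above concrete, here is a rough Python sketch of how they could be modelled in software. It is a toy model only, not Lattice's implementation and not something taken from the datasheet: the class and function names (`Lut4`, `lut5_read`, `ripple_add2`, `Spr16x4`) are made up for illustration, the LUT5 select input `m` is an assumed stand-in for the multi-purpose input, and Slice 2's address/control role in RAM mode is not modelled.

```python
"""Toy software model of MachXO2 slice modes (illustrative only)."""


class Lut4:
    """A 4-input LUT: 16 one-bit entries addressed by inputs (D, C, B, A)."""

    def __init__(self, init=0):
        self.init = init & 0xFFFF  # bit i holds the output for address i

    def read(self, a, b, c, d):
        addr = (d << 3) | (c << 2) | (b << 1) | a
        return (self.init >> addr) & 1

    def write(self, addr, bit):
        # RAM mode: the same 16 bits are treated as a writable 16x1 memory.
        self.init = (self.init & ~(1 << addr)) | ((bit & 1) << addr)


def lut4_from_function(fn):
    """Logic mode: build the truth table for any 4-input boolean function."""
    init = 0
    for addr in range(16):
        a, b, c, d = addr & 1, (addr >> 1) & 1, (addr >> 2) & 1, (addr >> 3) & 1
        init |= (fn(a, b, c, d) & 1) << addr
    return Lut4(init)


def lut5_read(lut_lo, lut_hi, a, b, c, d, m):
    """LUT5 from the two LUT4s of one slice: a fifth input picks one of them."""
    return (lut_hi if m else lut_lo).read(a, b, c, d)


def ripple_add2(a, b, carry_in=0):
    """Ripple mode: a 2-bit add with carry in and carry out."""
    total = (a & 3) + (b & 3) + (carry_in & 1)
    return total & 3, total >> 2  # (2-bit sum, carry out)


class Spr16x4:
    """16x4 single-port RAM built from four LUT4s used as 16x1 memories."""

    def __init__(self):
        self.bits = [Lut4() for _ in range(4)]

    def write(self, addr, value):
        for i, lut in enumerate(self.bits):
            lut.write(addr & 0xF, (value >> i) & 1)

    def read(self, addr):
        a, b, c, d = addr & 1, (addr >> 1) & 1, (addr >> 2) & 1, (addr >> 3) & 1
        return sum(lut.read(a, b, c, d) << i for i, lut in enumerate(self.bits))
```

For example, `lut4_from_function(lambda a, b, c, d: a & b & c & d)` yields a LUT whose only set entry is address 15, and chaining `ripple_add2` calls through the carry output gives wider adders, which is the idea behind concatenating slices.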

Each MachXO2 device has eight clock inputs (PCLK [T, C] [Banknum]_[2..0]): three pins on the left side, two pins each on the bottom and top sides, and one pin on the right side. These clock inputs drive the clock nets. These eight inputs can be differential or single-ended and may be used as general purpose I/O if they are not used to drive the clock nets. When using a single-ended clock input, only the PCLKT input can drive the clock tree directly.

The MachXO2 architecture has three types of clocking resources: edge clocks, primary clocks and secondary high fanout nets. MachXO2-640U, MachXO2-1200/U and higher density devices have two edge clocks each on the top and bottom edges. Lower density devices have no edge clocks. Edge clocks are used to clock I/O registers and have low injection time and skew. Edge clock inputs are from PLL outputs, primary clock pads, edge clock bridge outputs and CIB sources.

The eight primary clock lines in the primary clock network drive throughout the entire device and can provide clocks for all resources within the device, including PFUs, EBRs and PICs. In addition to the primary clock signals, MachXO2 devices also have eight secondary high fanout signals which can be used for global control signals, such as clock enables, synchronous or asynchronous clears, presets, output enables, etc. Internal logic can drive the global clock network for internally-generated global clocks and control signals.

The maximum frequency for the primary clock network is shown in the MachXO2 External Switching Characteristics table.

The primary clock signals for the MachXO2-256 and MachXO2-640 are generated from eight 17:1 muxes. The available clock sources include eight I/O sources and 9 routing inputs. Primary clock signals for the MachXO2-640U, MachXO2-1200/U and larger devices are generated from eight 27:1 muxes. The available clock sources include eight I/O sources, 11 routing inputs, eight clock divider inputs and up to eight sysCLOCK PLL outputs.

cmdada commented 4 months ago

I think it's gonna be a lot easier to fix everything else before this, but yeah, I still wanted to put this all in one location.

TotalTaxAmount commented 3 months ago

I agree we should fix stuff first, although we don't have to implement all the FPGA stuff in QEMU. I think all we have to do is have a device that can pretend to be an FPGA and send back what the roboRIO expects for that watchdog thing.
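As a rough illustration of the "pretend to be an FPGA" idea, here is a minimal Python sketch of a stub that answers register reads with canned values. Every register offset and value in it is a hypothetical placeholder, since the real roboRIO FPGA register map and watchdog behavior are not described in this thread. A real QEMU device would be written in C against QEMU's device model; this only shows the shape of the logic.

```python
"""Sketch of a stub "FPGA" that answers register accesses with canned values.

Every offset and value below is a hypothetical placeholder, not the real
roboRIO FPGA register map.
"""

# Hypothetical register offsets (assumptions for illustration only).
REG_VERSION = 0x00
REG_WATCHDOG_STATUS = 0x04
REG_WATCHDOG_FEED = 0x08


class StubFpga:
    def __init__(self, canned=None):
        # Values handed back on reads; callers can override them.
        self.regs = {REG_VERSION: 0x1, REG_WATCHDOG_STATUS: 0x1}
        if canned:
            self.regs.update(canned)
        self.feed_count = 0
        self.unknown_reads = []  # offsets the guest read that we do not model

    def read(self, offset):
        if offset not in self.regs:
            # Record the access so we can learn what the guest expects.
            self.unknown_reads.append(offset)
            return 0
        return self.regs[offset]

    def write(self, offset, value):
        if offset == REG_WATCHDOG_FEED:
            # Treat any write here as "feeding" the watchdog.
            self.feed_count += 1
            self.regs[REG_WATCHDOG_STATUS] = 0x1
        else:
            # Plain 32-bit register: store and read back.
            self.regs[offset] = value & 0xFFFFFFFF
```

Recording unmodelled reads in `unknown_reads` is a deliberate choice: running the guest against the stub and inspecting that list is one way to find out which registers actually need believable values.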

TotalTaxAmount / roborio-vm

FPGA support #3