trabucayre / openFPGALoader

Universal utility for programming FPGA
https://trabucayre.github.io/openFPGALoader/
Apache License 2.0

Altera::spi_put without calling shiftVIR #248

Open el-coder-sb opened 2 years ago

el-coder-sb commented 2 years ago

I noticed that, IMHO, there are unnecessary shiftVIR calls. Removing them (in consecutive calls) could speed up flash programming.

My opinion is based on comparing an svf file generated with Quartus against the oFL behavior.

shiftVIR vs shiftVDR

First I'd like to discuss something related to this topic: in the svf file, the write_enable and bulk_erase commands are shifted after selecting USER0 VDR, while oFL does it after selecting USER1 VIR, right? (e.g. Altera::spi_put)

What determines which register to use? Why are there differences between oFL and Quartus?

I'd like to get a better understanding of this.

svf file:

!BULK ERASE
!
SIR 10 TDI (00E);   !USER1 VIR
RUNTEST 36 TCK;
SDR 13 TDI (1001);  !USER1 DR   -> ADDR[(n - 1)..0] + VIR_VALUE[(m - 1)..0], where ADDR selects the SLD node
SIR 10 TDI (00C);   !USER0 VDR
RUNTEST 36 TCK;
SDR 8 TDI (60);     ! Write enable
SIR 10 TDI (00E);   !USER1 VIR
RUNTEST 36 TCK;
SDR 13 TDI (1010);
SIR 10 TDI (00C);   !USER0 VDR
RUNTEST 36 TCK;
SDR 8 TDI (60);     ! Write enable
SIR 10 TDI (00E);   !USER1 VIR
RUNTEST 36 TCK;
SDR 13 TDI (1001);
SIR 10 TDI (00C);   !USER0 VDR
RUNTEST 36 TCK;
SDR 8 TDI (E3);     ! Bulk erase
SIR 10 TDI (00E);   !USER1 VIR
RUNTEST 36 TCK;
SDR 13 TDI (1010);
SIR 10 TDI (00C);   !USER0 VDR
RUNTEST 36 TCK;
SDR 8 TDI (E3);     ! Bulk erase
RUNTEST 2147483647 TCK;
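
A note that may help when comparing this listing with oFL: the 8-bit values in the `SDR 8` lines look like the common SPI NOR flash opcodes with their bit order reversed (write enable `0x06` becomes `60`, chip erase `0xC7` becomes `E3`). That would be consistent with JTAG shifting the least significant bit first while SPI flash expects the most significant bit first. The sketch below only checks that arithmetic; the opcode values are the usual ones and should be verified against the flash datasheet, and `reverse_bits8` is a helper name chosen for illustration.

```python
def reverse_bits8(value: int) -> int:
    """Return the 8-bit value with its bit order reversed."""
    if not 0 <= value <= 0xFF:
        raise ValueError("expected an 8-bit value")
    result = 0
    for _ in range(8):
        # Move the current LSB of value into the growing result.
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


# Common SPI NOR opcodes (assumption: check the datasheet of the actual flash).
SPI_WRITE_ENABLE = 0x06
SPI_CHIP_ERASE = 0xC7

print(hex(reverse_bits8(SPI_WRITE_ENABLE)))  # 0x60, as in "SDR 8 TDI (60)"
print(hex(reverse_bits8(SPI_CHIP_ERASE)))    # 0xe3, as in "SDR 8 TDI (E3)"
```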

Altera::spi_put without calling shiftVIR

Regarding the unnecessary shiftVIR calls: as you can see, between two page program operations there is no SIR shifting at all. Maybe this can be done the same way in oFL?

As mentioned above I need some further understanding. Afterwards I should be able to provide some patches/help.

svf file:

!PROGRAM
!Set device 0 ncs 0
SIR 10 TDI (00E);   !USER1 VIR
RUNTEST 36 TCK;
SDR 13 TDI (1500);
SIR 10 TDI (00C);   !USER0 VDR
RUNTEST 36 TCK;
SDR 8 TDI (01);
SIR 10 TDI (00E);   !USER1 VIR
RUNTEST 36 TCK;
SDR 13 TDI (1001);
SIR 10 TDI (00C);   !USER0 VDR
RUNTEST 36 TCK;
SDR 8 TDI (55);
SIR 10 TDI (00E);   !USER1 VIR
RUNTEST 36 TCK;
SDR 13 TDI (1001);
SIR 10 TDI (00C);   !USER0 VDR
RUNTEST 36 TCK;
SDR 8 TDI (00);
SIR 10 TDI (00E);   !USER1 VIR
RUNTEST 36 TCK;
SDR 13 TDI (1041);
SIR 10 TDI (00C);   !USER0 VDR
RUNTEST 36 TCK;
SDR 2108 TDI (00055555258555550...000000400060);    ! page program incl. WE + PP cmd + Data
RUNTEST 72600 TCK;
SDR 2108 TDI (00055555555555555...008000400060);    ! page program incl. WE + PP cmd + Data
RUNTEST 72600 TCK;
SDR 2108 TDI (00000000000000000...00C000400060);    ! page program incl. WE + PP cmd + Data
RUNTEST 72600 TCK;
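
To put rough numbers on the listing above, here is a small sketch that counts the TCK cycles of one re-selection sequence (USER1 VIR, 13-bit SDR, USER0 VDR) against one page program transfer. The values are read directly from the SVF; TAP state transitions and host-side latency (for example USB round trips per transfer) are not included, so this is only an estimate of the JTAG clock cost, not of the real time saved in oFL.

```python
# Bit and cycle counts copied from the SVF listing.
SIR_BITS = 10          # "SIR 10 TDI (...)"
SIR_RUNTEST = 36       # "RUNTEST 36 TCK" after each SIR
USER1_SDR_BITS = 13    # "SDR 13 TDI (...)"
PAGE_SDR_BITS = 2108   # "SDR 2108 TDI (...)"
PAGE_RUNTEST = 72600   # "RUNTEST 72600 TCK" after each page


def reselect_overhead_tck() -> int:
    """TCKs for one USER1 VIR + SDR 13 + USER0 VDR sequence."""
    select_user1 = SIR_BITS + SIR_RUNTEST
    select_user0 = SIR_BITS + SIR_RUNTEST
    return select_user1 + USER1_SDR_BITS + select_user0


def page_tck() -> int:
    """TCKs for one page program transfer including its wait time."""
    return PAGE_SDR_BITS + PAGE_RUNTEST


overhead = reselect_overhead_tck()
page = page_tck()
print(f"re-selection: {overhead} TCK, page: {page} TCK, "
      f"ratio: {overhead / page:.4%}")
```

Counted this way the re-selection is a small fraction of a page transfer, so any larger gain in practice would have to come from costs this sketch does not model.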
trabucayre commented 2 years ago

Sorry for the delay. spiOverJtag for Altera uses a virtual JTAG primitive. Its purpose is to address one primitive by its address (more than one primitive may be present in the gateware), which is why shiftIR is used; the shiftDR that follows sends the command. VDR is then used to send a buffer (for example, when you write a page you have to send 3 bytes of address followed by 256 bytes of data). The primitive provides some specific signals used to control CS. It is always possible to send a burst of command + address + buffer directly using virtualDR, but I'm not sure it is possible to correctly address the primitive used by spiOverJtag, or to control CS, that way. I agree that in some cases, with this approach, openFPGALoader spends too much time in the JTAG FSM.
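
As an illustration of the "command + address + buffer" burst mentioned above, the following sketch assembles a page program frame: one opcode byte, 3 address bytes, then up to 256 data bytes. It is not the code of `Altera::spi_put`; `build_page_program` and `reverse_bits8` are hypothetical helpers, `0x02` is the usual SPI NOR page program opcode (to be checked against the datasheet), and the per-byte bit reversal is an assumption based on JTAG shifting LSB first.

```python
PAGE_SIZE = 256
SPI_PAGE_PROGRAM = 0x02  # common SPI NOR opcode (assumption: verify per device)


def reverse_bits8(value: int) -> int:
    """Return the 8-bit value with its bit order reversed."""
    result = 0
    for _ in range(8):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def build_page_program(addr: int, data: bytes, lsb_first: bool = True) -> bytes:
    """Build opcode + 24-bit address + data for one page program burst."""
    if not 0 <= addr <= 0xFFFFFF:
        raise ValueError("address must fit in 3 bytes")
    if len(data) > PAGE_SIZE:
        raise ValueError("data must not exceed one page")
    header = bytes([
        SPI_PAGE_PROGRAM,
        (addr >> 16) & 0xFF,
        (addr >> 8) & 0xFF,
        addr & 0xFF,
    ])
    frame = header + bytes(data)
    if lsb_first:
        # Reverse each byte so an LSB-first shifter emits MSB-first SPI bits.
        frame = bytes(reverse_bits8(b) for b in frame)
    return frame


frame = build_page_program(0x000100, bytes(PAGE_SIZE))
print(len(frame), frame[:4].hex())
```

Whether such a frame can be sent in a single virtual DR shift while still addressing the right primitive and driving CS correctly is exactly the open question in this comment.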

In fact I'm not really happy with this implementation: there are some known issues (with chains of more than one device, for example), but Cyclone has no internal OSC, so it may not be possible to have a better approach. At the least, I have to check whether it's possible to drop VDR when a transaction is limited to a command without data.

Concerning Altera's way of programming external flash: it seems to have a dedicated (undocumented) block used as a passthrough.

ilynxy commented 1 year ago

> but Cyclone has no internal OSC, so it may not be possible to have a better approach

Cyclone III/IV/10LP do have an internal OSC: cycloneiii_oscillator(oscena,clkout), cycloneiv_oscillator(oscena,clkout), cyclone10lp_oscillator(oscena,clkout), i.e. cyclone***_oscillator(oscena,clkout), running at 40-80 MHz.

trabucayre commented 1 year ago

That's true. I suppose I have to create two variants: one for Cyclone devices with an oscillator and one for families without.