drtrigon closed this 5 years ago
Are you running the code out of SPI flash by any chance?
Yes I am. Is there a way to speed this up? Cache or anything else?
What setup is needed to achieve the numbers given in https://github.com/cliffordwolf/picorv32#cycles-per-instruction-performance ?
> Cache or anything else?
Yes, you can create a cache. Or you can just copy all performance-critical code to RAM, or execute from a ROM. Whatever works for you. These are decisions you have to make about the system you are building and have nothing to do with PicoRV32 itself, as PicoRV32 is just the processor core.
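To illustrate the copy-to-RAM option, here is a minimal sketch of the copy step that a startup routine might run before calling the time-critical code. The name `copy_to_ram` and its parameters are hypothetical and not part of PicoRV32. In real firmware the three pointers would typically come from symbols defined in the linker script, and the copied code would also have to be linked to run at its RAM address.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy a block of 32-bit words from its load address (for example SPI
 * flash) to its run address in RAM. Returns the number of words copied.
 * Hypothetical helper: the pointer values are assumed to be provided by
 * the linker script in a real build.
 */
size_t copy_to_ram(uint32_t *dst, const uint32_t *src, const uint32_t *src_end)
{
    size_t words = 0;

    while (src < src_end) {
        *dst++ = *src++;   /* one word per iteration */
        words++;
    }
    return words;
}
```

The copy itself still runs from flash and is slow, but it runs only once at startup. After that, every fetch of the copied code comes from RAM.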
> What setup is needed to achieve the numbers given in [..]
Just a fast RAM. See https://github.com/cliffordwolf/picorv32/blob/master/dhrystone/testbench.v for the setup.
Sorry for coming back to this so late, but my time is limited... ;)
The project is: https://github.com/drtrigon/fpgarduino-icestorm
I would still like to make this work. My hardware is an Alhambra board and my development tool is icestudio. I can think of 2 possible ways to speed this up:
`delayMicroseconds` that basically just adds the correct number of `nop`s
@cliffordwolf: The reason why I asked here is because a cache controller (L1, L2) is usually implemented in the processor.
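For the `nop`-based `delayMicroseconds` idea, the number of `nop`s follows from the clock frequency and the cost of one `nop`. The sketch below assumes a hypothetical helper `nops_for_delay`. The cost per `nop` is not known here and is passed in as a parameter that would have to be measured on the target, since it depends heavily on where the code is fetched from.

```c
#include <stdint.h>

/*
 * Number of nop instructions needed to burn roughly `us` microseconds.
 * f_clk_hz:       core clock in Hz (12 MHz on the board in this thread)
 * cycles_per_nop: ASSUMPTION, must be measured on the actual system
 * The result rounds down, so the real delay is never longer than asked
 * for by more than the call overhead, which is ignored here.
 */
uint32_t nops_for_delay(uint32_t us, uint32_t f_clk_hz, uint32_t cycles_per_nop)
{
    if (cycles_per_nop == 0) {
        return 0;   /* avoid division by zero on a bad parameter */
    }
    uint64_t cycles = (uint64_t)us * f_clk_hz / 1000000u;
    return (uint32_t)(cycles / cycles_per_nop);
}
```

At 12 MHz one microsecond is only 12 cycles, so `nops_for_delay(1, 12000000, 3)` gives 4, while a `nop` that costs 64 cycles to fetch gives 0. In that case a 1 us delay cannot be reached this way.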
Naive question: is it possible that the following assembler code (excluding the variable declaration) needs 256 cycles, and every further iteration (++us) an additional 197 cycles?
I timed it using
with `a` being a `uint32_t`.
This seems like a lot just for an add and a compare/branch, also considering https://github.com/cliffordwolf/picorv32#cycles-per-instruction-performance.
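To put a number on "a lot", here is a rough estimate. It assumes the loop body really is one ALU instruction plus one taken branch, and it assumes costs of about 3 and 5 cycles for those when running from fast RAM. Both figures are assumptions that should be checked against the table in the linked README. The names are hypothetical.

```c
#include <stdint.h>

/* ASSUMED per-instruction costs when executing from fast RAM. */
enum {
    CYCLES_ALU = 3,           /* add / addi (assumption) */
    CYCLES_BRANCH_TAKEN = 5   /* taken conditional branch (assumption) */
};

/* Expected cost of one loop iteration: one add plus one taken branch. */
uint32_t expected_loop_cycles(void)
{
    return CYCLES_ALU + CYCLES_BRANCH_TAKEN;
}

/* How many times slower the measured loop is, rounded down. */
uint32_t slowdown_factor(uint32_t measured_cycles, uint32_t expected_cycles)
{
    return measured_cycles / expected_cycles;
}
```

Under these assumptions one iteration should cost about 8 cycles, so the measured 197 cycles is roughly 24 times slower. That would fit with instruction fetch dominating the time rather than execution.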
The reason why I'm asking is that I am trying to implement an equivalent to the Arduino `delayMicroseconds`, and this delay code takes between 16 and 21 us on an Alhambra II board running at 12 MHz.
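The cycle counts and the measured times agree with each other, as this small conversion sketch shows. `cycles_to_ns` is a hypothetical helper name. The only inputs are the 12 MHz clock and the 256 and 197 cycle figures quoted above.

```c
#include <stdint.h>

/*
 * Convert a cycle count to nanoseconds at a given clock frequency,
 * rounding down. Fine for short intervals; the multiplication would
 * overflow only for cycle counts above roughly 1.8e10.
 */
uint64_t cycles_to_ns(uint64_t cycles, uint64_t f_clk_hz)
{
    return cycles * 1000000000ull / f_clk_hz;
}
```

At 12 MHz, `cycles_to_ns(256, 12000000)` gives 21333 ns and `cycles_to_ns(197, 12000000)` gives 16416 ns, which is about 21.3 us and 16.4 us. That matches the observed 16 to 21 us range, so the timing measurement itself looks consistent and the cost lies in the cycles per iteration.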