Open somhi opened 2 years ago
What do you say @ravenslofty. Is Mistral in a position where it's worth adding an Edalize backend?
nextpnr-mistral is still very experimental due to the lack of M10K support (we're working on it!)
However, I don't expect any of the commands to meaningfully change.
yosys -p "synth_intel_alm ...; write_json ..." for synthesis
nextpnr-mistral ... for place and route; I suggest having some way of configuring --compress-rbf, because you can't JTAG load a compressed bitstream onto a Cyclone V.
mistral-cv timing foo.rbf, which is basically a preview of the new timing code
openFPGALoader for loading the bitstream.
Has anyone started to work on adding Mistral as an Edalize backend? If not, I have time to work on this.
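The four steps listed above could be strung together roughly as follows. This is only a sketch: it builds the command lines without running them. The `mistral_flow` helper, the file names, and the `read_verilog` front end are illustrative assumptions, and the `--device`, `--json`, `--qsf`, `--rbf` and `-b` flags are my understanding of the nextpnr-mistral and openFPGALoader command lines rather than something stated in this thread.

```python
import shlex


def mistral_flow(top, device, sources, qsf, board, compress_rbf=False):
    """Return the argv list for each step of a Yosys/nextpnr-mistral flow.

    compress_rbf defaults to False because a compressed bitstream
    cannot be loaded over JTAG onto a Cyclone V.
    """
    json_file = f"{top}.json"
    rbf_file = f"{top}.rbf"

    # Assumption: Verilog sources, read with Yosys' read_verilog.
    read_cmds = "; ".join(f"read_verilog {src}" for src in sources)
    yosys_script = (
        f"{read_cmds}; "
        f"synth_intel_alm -family cyclonev -top {top}; "
        f"write_json {json_file}"
    )

    pnr = [
        "nextpnr-mistral",
        "--device", device,
        "--json", json_file,
        "--qsf", qsf,
        "--rbf", rbf_file,
    ]
    if compress_rbf:
        pnr.append("--compress-rbf")

    return [
        ["yosys", "-p", yosys_script],           # synthesis
        pnr,                                     # place and route
        ["mistral-cv", "timing", rbf_file],      # timing preview
        ["openFPGALoader", "-b", board, rbf_file],  # load the bitstream
    ]


# Hypothetical project: the device string and board name are assumed
# values for a DE10-Nano, not taken from this thread.
for argv in mistral_flow("blinky", "5CSEBA6U23I7", ["blinky.v"],
                         "blinky.qsf", "de10nano"):
    print(shlex.join(argv))
```

Keeping each step as an argument list makes it easy to hand the result to `subprocess.run` later, or to expose `compress_rbf` as a backend option.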
I am able to generate a bitstream with FuseSoC using a Mistral backend for Edalize. It is very similar to the oxide backend. The --compress-rbf option is passed through nextpnr_options. What is still missing is the optional call to mistral-cv and a test script. The test script will be based on the test_oxide source code. Once the test script is added, I can make a pull request.
Link:
Modified edalize
cv96 blinky
Mistral is merged now. Would be awesome if you could add support in LED to Believe for some board that can use Mistral. Hoping to add SERV support as well eventually once we have memories working
Yes, it would be great to add Sockit or Chameleon96 Mistral targets to LED to Believe. I'll add it to the to-do list ;)
Thanks @infphyny :)
Then I will add 2 QMTech Cyclone V dev boards. The board with 5CEFA2F23 is officially supported by openFPGALoader.
I mean, truthfully the only officially supported board is the DE-10 Nano. By which I mean: it's the only one I have which I can test things on :P
Good, I will try first to add DE-10 Nano.
@somhi @infphyny @Ravenslofty Remind me, did we ever get all the way with this one?
nextpnr-mistral has had block RAM support for a long while now, but it was always a little flaky, so I've not exposed it on the Yosys side.
While the timings are very roughly correct, I've been waiting on Sarayan to rewrite the Mistral library to expose timing information.
It's been, heh, quite a while. But recently I've had the energy to rework some bits of Mistral, and thanks to some "bug fixes and performance improvements", nextpnr-mistral...should be capable of a corescore, thanks to initialised M10K support.
I wonder how it will do.
Thanks for the heads-up. So how do we test this? I would suggest starting by switching over an existing Servant target to use mistral. You have previously said that de10_nano is the best supported one. Is that still the case? If so, could you do an initial test? I could help out with the FuseSoC description and potentially even build a bitstream if it is just a matter of building with latest main branches of yosys and nextpnr
Yes, the DE10-Nano is still the best-supported board by virtue of me having one. I'd be happy to test a bitstream on the board if you send me one. Things should mostly just work with the latest Yosys and nextpnr git versions, though you will need to pass -nodsp to Yosys.
Very good news, will try to add corescore for de10 nano and QMTech board (5CEFA5F23I7N).
I made a small RGB blinky with the PWM values stored in BRAM. For the QMTech board, when I use 8 M10Ks, nextpnr-mistral is able to generate the bitstream. When I use 16 M10Ks or more, nextpnr-mistral crashes with a std::out_of_range error message. If it also fails for corescore, I will make a repro and debug with gdb to find which portion of the nextpnr-mistral code generates the error.
Thanks.
Can you get me a gdb backtrace of the error anyway?
I will compile nextpnr-mistral with debug info and will give you the trace.
The error seems to come from nextpnr/common/kernel/hashlib.h:597, called from nextpnr_mistral::Arch::getBelPinsForCellPin
Thread 1 "nextpnr-mistral" received signal SIGABRT, Aborted.
__pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>)
at ./nptl/pthread_kill.c:44
44 ./nptl/pthread_kill.c: No such file or directory.
(gdb) backtrace
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>)
at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=<optimized out>)
at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6)
at ./nptl/pthread_kill.c:89
#3 0x00007ffff583c406 in __GI_raise (sig=sig@entry=6)
at ../sysdeps/posix/raise.c:26
#4 0x00007ffff582287c in __GI_abort () at ./stdlib/abort.c:79
#5 0x00007ffff5ca4f26 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007ffff5cb6f2c in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#7 0x00007ffff5cb6f97 in std::terminate() ()
from /lib/x86_64-linux-gnu/libstdc++.so.6
#8 0x00007ffff5cb71f8 in __cxa_throw ()
from /lib/x86_64-linux-gnu/libstdc++.so.6
#9 0x00005555556ddbc6 in nextpnr_mistral::dict<nextpnr_mistral::IdString, nextpnr_mistral::ArchPinInfo, nextpnr_mistral::hash_ops<nextpnr_mistral::IdString> >::at (this=<optimized out>, key=...)
at /home/stche/Documents/Logiciel/FPGA/toolchain/YosysHQ/nextpnr/common/kernel/hashlib.h:597
#10 0x0000555555745e1d in nextpnr_mistral::Arch::getBelPinsForCellPin (
this=0x555566ddef50, pin=..., cell_info=<optimized out>)
With the DE10-Nano I got the same error. Earlier I forgot to press enter to get the rest of the backtrace, so here is the complete backtrace:
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>)
at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=<optimized out>)
at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6)
at ./nptl/pthread_kill.c:89
#3 0x00007ffff583c406 in __GI_raise (sig=sig@entry=6)
at ../sysdeps/posix/raise.c:26
#4 0x00007ffff582287c in __GI_abort () at ./stdlib/abort.c:79
#5 0x00007ffff5ca4f26 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007ffff5cb6f2c in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#7 0x00007ffff5cb6f97 in std::terminate() ()
from /lib/x86_64-linux-gnu/libstdc++.so.6
#8 0x00007ffff5cb71f8 in __cxa_throw ()
from /lib/x86_64-linux-gnu/libstdc++.so.6
#9 0x00005555556ddbc6 in nextpnr_mistral::dict<nextpnr_mistral::IdString, nextpnr_mistral::ArchPinInfo, nextpnr_mistral::hash_ops<nextpnr_mistral::IdString> >::at (this=<optimized out>, key=...)
at /home/stche/Documents/Logiciel/FPGA/toolchain/YosysHQ/nextpnr/common/kernel/hashlib.h:597
#10 0x0000555555745e1d in nextpnr_mistral::Arch::getBelPinsForCellPin (
this=0x555566ddef50, pin=..., cell_info=<optimized out>)
at /home/stche/Documents/Logiciel/FPGA/toolchain/YosysHQ/nextpnr/mistral/arch.h:446
#11 nextpnr_mistral::Context::predictArcDelay (this=0x555566ddef50,
net_info=0x55558c23f4b0, sink=...)
at /home/stche/Documents/Logiciel/FPGA/toolchain/YosysHQ/nextpnr/common/kernel/context.cc:104
#12 0x00005555557782cf in nextpnr_mistral::TimingAnalyser::get_route_delays (
this=0x7fffffffc800)
at /home/stche/Documents/Logiciel/FPGA/toolchain/YosysHQ/nextpnr/common/kernel/timing.cc:145
#13 0x00005555559b7a65 in nextpnr_mistral::TimingAnalyser::run(bool) [clone .constprop.0] (this=0x7fffffffc800, update_route_delays=true)
at /home/stche/Documents/Logiciel/FPGA/toolchain/YosysHQ/nextpnr/common/kernel/timing.cc:50
#14 0x00005555557bbdaf in nextpnr_mistral::HeAPPlacer::place (
this=0x7fffffffc680)
at /home/stche/Documents/Logiciel/FPGA/toolchain/YosysHQ/nextpnr/common/place/placer_heap.cc:281
#15 0x00005555557c7ef0 in nextpnr_mistral::placer_heap (ctx=0x555566ddef50,
cfg=...)
at /home/stche/Documents/Logiciel/FPGA/toolchain/YosysHQ/nextpnr/common/place/placer_heap.cc:1812
#16 0x000055555580718a in nextpnr_mistral::Arch::place (this=0x555566ddef50)
at /home/stche/Documents/Logiciel/FPGA/toolchain/YosysHQ/nextpnr/common/kernel/basectx.h:167
#17 0x000055555574112a in nextpnr_mistral::CommandHandler::executeMain (
this=this@entry=0x7fffffffd1b0,
ctx=std::unique_ptr<nextpnr_mistral::Context> = {...})
at /usr/include/c++/12/bits/unique_ptr.h:191
#18 0x0000555555741651 in nextpnr_mistral::CommandHandler::exec (
this=0x7fffffffd1b0) at /usr/include/c++/12/bits/unique_ptr.h:189
#19 0x00005555557119d8 in main (argc=<optimized out>, argv=<optimized out>)
at /home/stche/Documents/Logiciel/FPGA/toolchain/YosysHQ/nextpnr/mistral/main.cc:100
Would it be possible to zip up and send me the build directory, or whatever edalize creates?
Yes, I will put it on GitHub and give you a link. It is written in VHDL. I will try to run with only one BRAM on the DE10-Nano board to see if it works.
I'm not asking for the source, I'm asking for the build directory.
Great, it works on the DE10-Nano when using only one BRAM. I don't use Edalize, only a plain makefile for now.
Well, I'm looking for the .json file generated by Yosys, the .qsf file and the nextpnr-mistral command line
Ok will give you link for those files
I found and fixed the issue last night, and then promptly spent the day sleeping.
Anyway, the issue should be fixed, and it compiles on my end with latest nextpnr.
Thank you, my example works now. I just have to learn how to use a PLL with Mistral to get a corescore.
Unfortunately PLLs aren't plumbed into nextpnr, since the vendor primitive is an utter mess and I have the irrational hope I can do better than it...
Ok, so if PLLs aren't supported, then I think it's fine to use the clock input directly. Might be a bit tougher for the router, depending on the input frequency though.
@infphyny, it is probably easiest to use a UART as the emitter, like on the icestick target. (Frankly, that is probably how we should do it for all targets eventually.)
1 core works. 50 cores work. Build time is ~10 minutes. According to nextpnr-mistral, Fmax is ~35MHz, but it works at 50MHz. Trying 256 cores... still building without errors after ~45 min. Corescore with Quartus is 271.
Info: Device utilisation:
Info: MISTRAL_COMB: 66648/83820 79%
Info: MISTRAL_FF: 65330/167640 38%
Info: MISTRAL_IO: 3/ 472 0%
Info: MISTRAL_CLKENA: 1/ 2 50%
Info: cyclonev_oscillator: 0/ 1 0%
Info: cyclonev_hps_interface_mpu_general_purpose: 0/ 1 0%
Info: MISTRAL_M10K: 256/ 553 46%
Will make a pull request this weekend. Maybe I will not choose the highest possible score to have a working bitstream in 10 to 20 minutes.
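To compare builds like the 256-core attempt above, it can help to pull the numbers out of the utilisation report. Here is a small sketch that parses lines in the format shown in the log above; the `parse_utilisation` helper and its regular expression are my own illustration, not part of nextpnr.

```python
import re

# Matches lines such as "Info: MISTRAL_IO: 3/ 472 0%".
UTIL_LINE = re.compile(r"Info:\s+(\S+):\s+(\d+)/\s*(\d+)\s+\d+%")

LOG = """\
Info: Device utilisation:
Info: MISTRAL_COMB: 66648/83820 79%
Info: MISTRAL_FF: 65330/167640 38%
Info: MISTRAL_IO: 3/ 472 0%
Info: MISTRAL_CLKENA: 1/ 2 50%
Info: MISTRAL_M10K: 256/ 553 46%
"""


def parse_utilisation(log):
    """Map each resource name to a (used, available) tuple."""
    usage = {}
    for line in log.splitlines():
        match = UTIL_LINE.search(line)
        if match:
            name, used, total = match.groups()
            usage[name] = (int(used), int(total))
    return usage


usage = parse_utilisation(LOG)
# The resource with the highest used/available ratio is the one
# most likely to limit a larger core count.
busiest = max(usage, key=lambda name: usage[name][0] / usage[name][1])
print(busiest, usage[busiest])
```

For this report the busiest resource is the combinational logic, which fits the later observation that placement and routing, not block RAM, limit the core count.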
Well done everyone! Having Mistral at this level with a CoreScore to prove it is fantastic. I just finished building the tools now, and if you share the de10_nano support, I can run a couple of builds on my side too, to see what CoreScore we can reasonably achieve
In general Cyclone V seems to be more challenging to route for nextpnr than, say, ECP5.
I think this comes down to each LAB having pretty major Tile Dispatch congestion. In slightly less technical terms: a LAB is made up of 10 ALMs, and each ALM has 8 inputs, but the LAB itself only has 46 inputs from global routing (through the Tile Dispatch muxes), and four of these inputs are reserved by nextpnr for FF control signals.
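The numbers in the paragraph above can be turned into a quick back-of-the-envelope calculation. This is just the arithmetic from that paragraph under a worst-case assumption (every ALM input fed from global routing, no sharing of signals between ALMs); the variable names are mine.

```python
ALMS_PER_LAB = 10
INPUTS_PER_ALM = 8
LAB_GLOBAL_INPUTS = 46          # through the Tile Dispatch muxes
RESERVED_FOR_FF_CONTROL = 4     # reserved by nextpnr

# Worst case: every ALM input is a distinct signal from global routing.
worst_case_demand = ALMS_PER_LAB * INPUTS_PER_ALM

# Inputs actually left for ALM data signals.
usable_inputs = LAB_GLOBAL_INPUTS - RESERVED_FOR_FF_CONTROL

# Average number of distinct global inputs available per ALM.
per_alm_budget = usable_inputs / ALMS_PER_LAB

print(worst_case_demand)  # 80
print(usable_inputs)      # 42
print(per_alm_budget)     # 4.2
```

So a fully packed LAB can only work if its ALMs share many input signals or use fewer than 8 inputs each, which is one way to read the congestion described above.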
(I find it very interesting that the 50-core SoC runs at 50MHz despite signing off at 35MHz (assuming you're using the third Fmax number that nextpnr-mistral prints); I have been assured that routing delays from the analogue simulator match Quartus, so, uh, hm.)
I think both Olof and I would rather you PR the config a bit sooner than the weekend; on my end at least I want to see if there are any obvious things that can be done to improve performance, given, say, a 100-core SoC.
corescore.zip: the target is de10_nano_mistral. You need a serial-to-USB adapter to see the output on the console. I'm off to work now. If something is missing, I will add the missing files tonight. Thanks.
Thanks @infphyny! 50 cores finished without problems. Running 150, 260 and 300 now in parallel. Will keep you posted on the results
The default config can't actually place 100 cores, due to severe constraints on synchronous-clears (one SCLR per LAB); to get 100 cores to work I need to modify Yosys to not use the dedicated synchronous-clears...
aha. So no need at this point to try with higher numbers then?
Can I still claim a corescore of 100 if I hacked Yosys? Or do I have to settle for 50?
(This is, for better or worse, just part of the maturing of a toolchain)
I want to have the number that is possible to achieve with upstream tooling, but let's revisit this when the toolchain improves. I think it's already fantastic that we can get a CoreScore at all using Mistral.
One thing that crossed my mind, out of curiosity: would it make any difference to swap either Yosys or nextpnr for the corresponding Quartus step instead?
sigh
So, the problem there is primarily memory blocks:
When you use an initial block to set up a memory block, Quartus will read it and turn it into its own format: .mif
So if a design has memories, Yosys can't communicate the memory contents to Quartus for a Yosys->Quartus flow, and Quartus can't communicate the memory contents to Yosys for a Quartus->nextpnr flow.
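For readers who haven't met the .mif format mentioned above, here is a minimal sketch of what such a file contains, written as a Python generator. The `to_mif` helper is hypothetical and follows my understanding of the Memory Initialization File text layout; it does not bridge Yosys and Quartus, it only illustrates the format.

```python
def to_mif(words, width):
    """Render a list of integer memory words as .mif text."""
    digits = (width + 3) // 4  # hex digits needed per word
    lines = [
        f"DEPTH = {len(words)};",
        f"WIDTH = {width};",
        "ADDRESS_RADIX = HEX;",
        "DATA_RADIX = HEX;",
        "CONTENT",
        "BEGIN",
    ]
    for addr, word in enumerate(words):
        if not 0 <= word < (1 << width):
            raise ValueError(f"word {word:#x} does not fit in {width} bits")
        lines.append(f"{addr:X} : {word:0{digits}X};")
    lines.append("END;")
    return "\n".join(lines) + "\n"


# Four 8-bit words, e.g. PWM values for a blinky.
print(to_mif([0x00, 0x40, 0x80, 0xFF], width=8))
```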
Ah, I see. So that's where those pesky .mif files come in. Alright. Then I won't spend any effort supporting a mix of Quartus and FOSSi tooling at this point.
A couple more data points. 84 cores is the maximum I can achieve. 85 cores is stuck in routing. It has done 15000 iterations so far and doesn't look like it will converge. 150 cores is stuck in the placer. Attaching the 84-core version in case someone with a board wants to check that it works. The archive contains the whole Edalize-generated work dir so you can rebuild it with the makefile if you want to. c84.tar.gz
Tested corescore_0.rbf inside the c84 folder. Corey counts 6 cores instead of 84. I will do more tests on my side. It's great that we can build a RISC-V SoC with Mistral.
Just an update: Mistral computes Fmax correctly. I have implemented a simple clock divider. Mistral is able to route the divided clock signal onto global routing. Now the SoC runs at 12.5 MHz and I got 70 cores running without issue. Trying to get 84 cores, the maximum Olof has achieved.
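The divider described above isn't shown in the thread, so here is a behavioural sketch in Python of a simple counter-based divider. It assumes a 50 MHz input clock (the frequency mentioned earlier in the thread) and a divide-by-4 ratio, which is what 12.5 MHz would imply; the actual HDL may well differ.

```python
def divide_clock(n_input_edges, ratio):
    """Model a counter-based divider.

    The output toggles every ratio/2 rising edges of the input clock,
    giving a 50% duty cycle. Returns the output level after each edge.
    """
    if ratio < 2 or ratio % 2:
        raise ValueError("ratio must be an even number >= 2")
    half_period = ratio // 2
    out, count, trace = 0, 0, []
    for _ in range(n_input_edges):
        count += 1
        if count == half_period:
            out ^= 1   # toggle the divided clock
            count = 0
        trace.append(out)
    return trace


def divided_frequency_mhz(f_in_mhz, ratio):
    """Output frequency of the divider in MHz."""
    return f_in_mhz / ratio


print(divide_clock(8, 4))            # [0, 1, 1, 0, 0, 1, 1, 0]
print(divided_frequency_mhz(50, 4))  # 12.5
```

The output repeats every four input edges, so a 50 MHz input gives the 12.5 MHz clock reported above.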
I need support for Project Mistral for my Chameleon96 and Terasic SoCKit Altera Cyclone V boards. A blinky example made with Project Mistral is available here: https://github.com/kprasadvnsi/mistral-CV96-blinky