amithmath closed this issue 1 year ago
Hi,
We have an open-source example of BP+manycore here. This is an FPGA prototype of an ASIC we have taped out, so the default configuration is much smaller so as to fit. But if you’re simulating or taping out, you can easily scale up the cores as needed.
https://github.com/black-parrot-hdk/zynq-parrot/tree/master/cosim/hammerblade-example
I got your point. If I want to tape out bsg_f1 in TSMC 16 nm FinFET, the chip has to communicate through PCIe and with DRAM. What about those IPs for ASIC tape-outs? Do you have any of them? If not, do I have to use Cadence/Synopsys IPs?
https://github.com/bespoke-silicon-group/basejump_stl/tree/master/bsg_link
We use bsg_link as an off-chip DDR tunnel for I/O. We have taped out in 12 nm and 28 nm. We typically use off-the-shelf LVCMOS I/Os and live with the modest bandwidth. For PCIe specifically, you need analog components, so you will probably need to purchase IP.
https://github.com/bespoke-silicon-group/basejump_stl/tree/master/bsg_dmc
We also have an LPDDR1 controller, which has been taped out in 28 nm. Adapting it to a new DDR module is non-trivial but fairly straightforward. Otherwise, the two main options are 1) using bsg_link and bsg_channel_tunnel to tunnel your DRAM traffic along with your I/O traffic, or 2) yes, purchasing an off-the-shelf DRAM IP. Option 1 is solid and cheap in terms of IP as well as pin cost.
Assuming I go for option 1), DDR + PCIe + bsg_f1 alone won't suffice, because when the card is powered on, I think it must first go through a boot sequence before it is ready to use as a standalone accelerator card. On FPGA cards, I guess this is initially handled by a hard processor on the board, which could be an ARM or a MicroBlaze. For a standalone ASIC accelerator card, I have no idea about the boot sequence code/logic or which processor should run it. Any suggestions?
We would have to know more about your intended use case to make a recommendation. If you want a standalone card, BlackParrot is able to boot from a ROM and could then bootstrap the system from on-board flash.
We typically make test chips with many redundancies, so I/O is handled from the PS of an attached FPGA board such as DoubleTrouble. You can find these (open-source) designs here: http://bjump.org/index.html
I want to make it a standalone GPGPU accelerator card for general-purpose computation, like NVIDIA graphics cards. Any recommendations?
Gotcha, that is a complex task well outside the scope of this repo. Happy to have a chat offline, if you want to discuss further at petrisko@cs.washington.edu
Okay, I will email you.
Hi,
I was reading the paper The Celerity Open-Source 511-Core RISC-V Tiered Accelerator Fabric; I guess the 511 RISC-V cores are connected to 64-bit RISC-V Rocket/BlackParrot cores. But in the bsg_replicant/bsg_f1 (old) repo, the 511 RISC-V cores communicate with the host CPU through a PCIe slot. The same is mentioned in the HammerBlade Technical Reference Guide (https://docs.google.com/document/d/1b2g2nnMYidMkcn6iHJ9NGjpQYfZeWEmMdLeO_3nLtgo/edit); page 6, footnote 1 says: The current version of HammerBlade does not yet include BlackParrot cores, but will soon. Instead, it is controlled by a Linux host over PCIe that connects to an I/O node on the manycore, much like a discrete GPU.
I am looking for a repo where the 511 RISC-V cores are connected to BlackParrot. Can you please point me to it?
I did check these repos: https://github.com/black-parrot and https://github.com/bespoke-silicon-group, but I was unable to locate it.
Many Thanks, -Amit