darklife / darkriscv

Open-source RISC-V CPU core implemented in Verilog from scratch in one night!
BSD 3-Clause "New" or "Revised" License

Is this board named khc40gbe_k420? #32

Open YangWang92 opened 3 years ago

YangWang92 commented 3 years ago

Hi all, I bought a board from AliExpress and it looks like this k420t.

Is this board named khc40gbe_k420? Thanks a lot!

https://github.com/darklife/darkriscv/tree/master/boards/aliexpress_hpc40gbe_k420

samsoniuk commented 3 years ago

It is the same board, but a different version: my board is the Kintex-7 HPC V2 and yours is the Kintex-7 HPC V3!

I guess the boards are very similar, but you may need to double-check the pinout against the schematic.

In my preliminary tests with ISE 14.7, the DarkRISCV worked fine at 220MHz:

[image attachment]

YangWang92 commented 3 years ago

I have confirmed that I have the same board, the V2, with three JTAG headers. :)

The seller has started selling the K420v3 and KU040 on AliExpress again.

Thanks a lot!

samsoniuk commented 3 years ago

@YangWang92 do you have the link for the new K420v3 board?

I got my board from the "FPGA board store", but the product is not available anymore :(

https://www.aliexpress.com/item/32956526454.html

In fact, the "FPGA board store" appears to be closed (very sad)... Anyway, I found some interesting boards in the "HPC FPGA board store", which appears to be connected to the old FPGA board store. In particular, I am interested in buying this KU040 board with 2x40Gbps:

https://www.aliexpress.com/item/4001302554837.html

The board is very interesting for servers because it is low-profile and can be fitted with a KU040 or KU060. Unfortunately, it lacks DDR3, but depending on the application it is possible to use the host computer's main memory via the PCIe x8 v3 bus.

YangWang92 commented 3 years ago

@samsoniuk Here is the link: https://item.taobao.com/item.htm?spm=a1z09.2.0.0.4dca2e8dvGmSwd&id=6214809309&_u=m1pd6eck89c4. The seller designed these boards, and I guess that the "FPGA board store" is an agent.

BTW, he is designing a KU040 board with DDR4 support: https://item.taobao.com/item.htm?spm=a1z10.3-c.w4002-3589537003.9.4b732890iUfRPj&id=577331392431

samsoniuk commented 3 years ago

Anyway, I will keep the issue open, since I am planning to introduce some new features regarding the use of the DarkRISCV in HPC applications. According to my initial calculations, it is possible to fit between 80 and 160 cores, depending on the complexity of the glue logic around the core. Supposing a cluster of 128 cores, each one running at 200MHz, we get a peak performance of 25600 MIPS! :)
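The peak figure above can be reproduced with a short calculation. This is only a sketch: the function name `peak_mips` and the `ipc` parameter are illustrative, and the default of one instruction per clock per core is an assumption implied by the arithmetic in the comment, not a measured value.

```python
def peak_mips(cores: int, clock_mhz: float, ipc: float = 1.0) -> float:
    """Peak throughput in MIPS for a cluster of identical cores.

    Assumption: every core retires `ipc` instructions per clock cycle
    (1.0 by default), with no stalls and no contention between cores.
    """
    return cores * clock_mhz * ipc


if __name__ == "__main__":
    # Core counts mentioned in the thread: the 80..160 range and the 128-core example.
    for cores in (80, 128, 160):
        print(f"{cores} cores @ 200 MHz: {peak_mips(cores, 200):.0f} MIPS")
```

Any real design will land below this bound, since the glue logic, memory access, and routing all affect the achievable clock and the effective instructions per clock.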

YangWang92 commented 3 years ago

Wow, will you build a many-core design? I'm really interested in this design and its applications. Maybe we can build SmartNICs/programmable NICs or programmable accelerator cards based on this design.

Do you know this work? http://fpga.org/2019/08/19/2grvi-phalanx-at-hot-chips-31-2019/

This might be a good reference design for your work.

samsoniuk commented 3 years ago

@YangWang92 yes yes, I am aware of the work from Jan Gray! In fact, the DarkRISCV is very influenced by the same concepts, such as Jan’s Razor: “In a chip-multiprocessor, cut inessential resources from each CPU, to maximize CPUs per die.” Of course, Jan has much more experience designing compact soft-processors, to the point that the GRVI core fits in only 320 LUTs and works at 375MHz in an UltraScale+ VU9P.

The DarkRISCV core requires around 1 kLUTs and only reaches 220MHz on a Kintex-7, but at least it is open source and can be easily adapted to different scenarios. In the case of the K420, my initial plan is to replace the DarkSoCV top level with a "DarkHPC" block composed of the DarkRISCV core and a pair of memories (ROM/RAM), all replicated in a generate loop. Each memory is connected to a DarkRISCV core on one port, and the other port is memory-mapped via PCIe, so that the PC can easily read/write all the ROM/RAM memories. A more complex scenario is to add one Ethernet interface per core and try to implement a switch, so that the traffic from the 2x40GbE interfaces is distributed across 80x1GbE interfaces and 80 DarkRISCV cores running at 200MHz, each one running some kind of very optimized code to handle packets... but, of course, those plans are for the future!
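One way to picture the PCIe memory-mapped view of the replicated ROM/RAM pairs is as a flat window in which each core owns a fixed-size slice. The sketch below is a host-side model only, not the DarkRISCV design: the 4 KiB ROM and RAM sizes, the layout (ROM first, then RAM), and all names (`decode`, `core_base`, `per_core_gbps`) are assumptions made for illustration.

```python
from typing import NamedTuple

# Assumptions for illustration only: per-core memory sizes and layout.
ROM_BYTES = 4096
RAM_BYTES = 4096
CORE_STRIDE = ROM_BYTES + RAM_BYTES  # bytes of the window owned by one core


class Target(NamedTuple):
    core: int    # index of the core that owns the address
    memory: str  # "rom" or "ram"
    offset: int  # byte offset inside that memory


def core_base(core: int) -> int:
    """Offset of a core's slice inside the (hypothetical) PCIe window."""
    return core * CORE_STRIDE


def decode(addr: int, cores: int = 80) -> Target:
    """Map a window offset to (core, memory, offset)."""
    core, local = divmod(addr, CORE_STRIDE)
    if not 0 <= core < cores:
        raise ValueError(f"address {addr:#x} is outside the {cores}-core window")
    if local < ROM_BYTES:
        return Target(core, "rom", local)
    return Target(core, "ram", local - ROM_BYTES)


def per_core_gbps(links: int, link_gbps: float, cores: int) -> float:
    """Even split of the aggregate line rate across the cores."""
    return links * link_gbps / cores


if __name__ == "__main__":
    print(decode(core_base(79) + ROM_BYTES + 4))
    print(per_core_gbps(2, 40, 80))
```

With power-of-two sizes like these, the same decode reduces to slicing address bits in hardware, and the even split of 2x40GbE over 80 cores matches the 1GbE per core mentioned above.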

For now, it is very good to know that the Kintex-7 HPC board is not dead and that there are other new boards on the way. In my case, I guess it is not possible to buy directly from taobao.com, since I am outside China, but I hope it is possible to ask some AliExpress stores located in HK to buy and forward the boards.

nightseas commented 3 years ago

@samsoniuk I'm looking in the same direction, at many-core HPC applications, and I plan to use DarkRISCV to implement something similar to GRVI.

The biggest challenge is not stacking up cores in the FPGA, but setting up a proper programming and processing model. Usually, on a many-core DSP-based accelerator there are schedulers to schedule the resources (e.g. cores) and tasks, and there are different parallel programming models for solutions from various vendors. The idea of using a generic compiler and C for an accelerator is nice, but I can't figure out how to manage the cores and the data being processed in a unified way, since the cores work more or less independently. Meanwhile, supporting a language like CUDA or OpenCL would take a huge effort, which doesn't look good for a small open project.

The PCIe MM solution you mentioned is a good start, and it can probably be improved by introducing XDMA and AXI-Stream for kernel (instruction) downloading and data transfer. Then, adding a control interface (e.g. a ring bus) would make it possible to control the status (e.g. run/stop) of the cores and collect info from them. An Ethernet controller for each core sounds weird: an Ethernet software stack takes a lot of processing resources on an MCU-level processor.

I'm new to RISC-V but willing to join and contribute.

YangWang92 commented 3 years ago

@nightseas I agree with you that the programming model and compiler are the most important part of the software design.

It should be a programming language built for an MIMT architecture. The "OpenCelerity" project built a "CUDA-like" programming language; it might be an example for us.

BTW, which NoC design will you use in DarkRISCV? It is also an important design choice for us.