Open SebastianHambura opened 1 year ago
Hi, I have a few comments/queries, hopefully we can get this working.
Have you been able to run one of the examples without issue? e.g. `sklearn_to_hdl.py` would be a good start for the HDL backend.
As well as the VHDL files produced, there will be some scripts for simulation and synthesis. Those can be a good starting point to jump from the conifer Python to a Vivado project. There is support for GHDL simulation in conifer; you can access it by setting `conifer.backends.vhdl.simulator = conifer.backends.vhdl.GHDL` before starting conversion of the model. When you `compile` or `build` the project after that, a script `ghdl_compile.sh` will be written out. It's not so different from the one you posted, but there are some more flags for the libraries and the VHDL 2008 setting.
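To make the comment above concrete, here is a minimal sketch of that flow. Only the `simulator` setting and the `compile`/`build` steps come from this thread; the converter entry point `convert_from_sklearn` and the `auto_config()` helper are assumptions, so check the bundled examples (e.g. `sklearn_to_hdl.py`) for the exact API of your conifer version:

```python
import conifer
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Select GHDL as the simulator for the VHDL backend
# (this line is quoted verbatim from the comment above).
conifer.backends.vhdl.simulator = conifer.backends.vhdl.GHDL

X, y = make_classification(n_samples=1000, n_features=10)
clf = GradientBoostingClassifier(n_estimators=20, max_depth=3).fit(X, y)

# Assumed entry points; the examples shipped with conifer show
# the exact calls for your installed version.
cfg = conifer.backends.vhdl.auto_config()
model = conifer.converters.convert_from_sklearn(clf, cfg)

model.compile()  # writes out ghdl_compile.sh among the project scripts
model.build()    # runs synthesis (requires Vivado on PATH)
```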
I tried doing something like you described, adding `ghdl --synth --work=BDT --std=08 --out=verilog BDT.BDTTop > BDTTop.v` to the `ghdl_compile.sh` script, then fiddling the VHDL testbench `BDTTestbench.vhd` to instantiate the Verilog module, and running the whole thing from Vivado 2020.1. I also see only `X`s for the output. I'll keep digging into it, but the unmodified workflow does work for running synthesis and simulation.
I do also see this warning from GHDL: `AddReduce.vhd:28:11:warning: declaration of "addreduce" hides entity "addreduce" [-Whide]`. I will look into whether that's the issue.
Is there any reason to choose one backend over another for your project? It might be worth trying the HLS backend and exporting the C Synthesized model as Verilog if that's how you'll integrate it later.
Also, the warning `AddReduce.vhd:28:11:warning: declaration of "addreduce" hides entity "addreduce" [-Whide]` occurs because of recursive instantiation of the component.
`sklearn_to_hdl.py` is working without issue 👍
I can also import the VHDL files into Vivado and run the simulation (I just need to manually change the library from the files to BDT). I'm still a bit confused why it's both complaining about a missing file and running the simulation fine, but whatever.
I haven't used HLS yet and from your paper the RTL backend seemed to perform better, so I would have preferred to use the VHDL backend.
I started going this route because `model.build()` was taking very long and then failing, so I tried to manually import the VHDL files into Vivado, which also failed (probably because I'm not used to working with VHDL). I found a potential culprit though: for my xgboost model (10 trees, up to depth 7) the generated `Arrays0.vhd` is 756.6 kB, while the one generated from the example is only 13 kB. It could be that I'm hitting some memory issues with Vivado when trying to synthesize this model. At least once `model.decision_function` failed and I got this in `ghdl.log`:
```
/snap/ghdl/1/bin/ghdl:error: declaration of a too large object (359 > --max-stack-alloc=128 KB)
in process .testbench(rtl).input@simulationinput(rtl).P0
  from: bdt.types.to_tyarray2d
  from: bdt.arrays0.ELAB_SPEC
  from: bdt.arrays0.ELAB_BDY
  from: bdt.bdttop.PKG_ELAB
  from: xil_defaultlib.testbench(rtl).uut.CMP_ELAB
  from: xil_defaultlib.testbench(rtl).DECL_ELAB
  from: Elaboration of design
/snap/ghdl/1/bin/ghdl:error: error during elaboration
```
I've tried adding `--work=BDT --std=08` to my commands, but I still get these warnings:

```
AddReduce.vhd:59:10:warning: no assignment for offsets 0:17 of signal "dint"
signal dInt : tyArray(0 to intLen - 1) := (others => (others => '0'));
```

Do you also have them, or is it an issue on my side?
From the example: `DEBUG:conifer.model:Converted BDT with parameters {"max_depth": 3, "n_trees": 20, "n_features": 10, "n_classes": 2}`
From my code: `DEBUG:conifer.model:Converted BDT with parameters {"max_depth": 10, "n_trees": 10, "n_features": 30, "n_classes": 2}`
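A quick back-of-the-envelope check, using only the parameters from those two debug lines, shows why generating full trees would blow up `Arrays0.vhd` (the per-node on-disk size is not known, so only the node counts are compared here):

```python
def full_tree_nodes(max_depth: int) -> int:
    """Nodes in a full binary tree of the given depth: 2^(depth+1) - 1."""
    return 2 ** (max_depth + 1) - 1

# Example model: 20 trees of depth 3.
example_nodes = 20 * full_tree_nodes(3)   # 20 * 15 = 300
# My xgboost model: 10 trees, padded out to depth 10.
my_nodes = 10 * full_tree_nodes(10)       # 10 * 2047 = 20470

print(example_nodes, my_nodes)
```

The roughly 68x jump in node count is in the same ballpark as the growth of `Arrays0.vhd` from 13 kB to 756.6 kB.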
Are you always producing fully-fledged trees? This is one of the trees from my xgboost model. If you're also generating nodes which are not used or don't exist in the original tree, I can see how that could become a memory/performance issue further down the synthesis pipeline.
Regarding depth and sparsity, since `v1.0` the HLS backend may perform better for sparse trees (this is also much more recent than the paper). Indeed, as you suggest, the VHDL backend currently pads sparse trees to make them full, which can be quite costly for synthesis time. It's on the to-do list to give the VHDL backend the same treatment to avoid padding the trees, but it's not yet implemented. The padded trees shouldn't contribute much extra resource usage, but they definitely hurt synthesis time (and, evidently, simulation).
I left some information and performance plots on the PR #41 that introduced this HLS improvement for sparse trees. The change in resource usage and latency is small, but the improvement in synthesis time is significant.
I've been evaluating the HLS backend with recent Vitis HLS versions (2023.1), which also tends to synthesize faster than older versions of the software.
> I started going this route because model.build() was taking very long and then failing

What is the actual failure? Is there any information in the Vivado log file?
> Do you also have them, or is it an issue from my side ?

I saw this warning when running the Verilog export/synthesis, but not for the out-of-the-box VHDL simulation.
I've tried with varying max-depth parameters in xgboost, and to me it looks like my computer doesn't have enough memory to build the model.
Max depth | Run time | Vivado Peak memory usage | Result |
---|---|---|---|
7 | 1:16 | 2936 MB | ✓ |
8 | 4:48 | 3166 MB | ✓ |
9 | 14:05 | 5624 MB | ✓ |
10 | 27:43 | 14159 MB | ❌ |
And when it fails, there is no specific error message:

```
ERROR:conifer.backends.vhdl.writer:build failed, check build.log
/media/sebastian/Software/Xilinx/Vivado/2020.1/bin/loader: line 286: 25252 Killed "$RDI_PROG" "$@"
```
```
---------------------------------------------------------------------------------
Finished Synthesize : Time (s): cpu = 00:26:23 ; elapsed = 00:26:36 . Memory (MB): peak = 14159.375 ; gain = 12008.105 ; free physical = 1535 ; free virtual = 4312
---------------------------------------------------------------------------------
---------------------------------------------------------------------------------
Finished Constraint Validation : Time (s): cpu = 00:26:56 ; elapsed = 00:27:12 . Memory (MB): peak = 14159.375 ; gain = 12008.105 ; free physical = 1958 ; free virtual = 4635
---------------------------------------------------------------------------------
---------------------------------------------------------------------------------
Start Loading Part and Timing Information
---------------------------------------------------------------------------------
Loading part: xczu3eg-sfvc784-1-e
```
So I think Vivado crashes because it runs out of memory.
Max depth | Run time | HLS Peak memory usage | Result |
---|---|---|---|
8 | 04:33 | 3249 MB | ✓ |
9 | 17:51 | 9809 MB | ✓ |
10 | 1:19:26 | 6393 MB and then fail | ❌ |
```
INFO: [HLS 200-489] Unrolling loop 'Activate' (firmware/BDT.h:65) in function 'BDT::Tree<0, 2047, 2038, ap_fixed<18, 8, (ap_q_mode)5, (ap_o_mode)3, 0> [30], ap_fixed<18, 8, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<18, 8, (ap_q_mode)5, (ap_o_mode)3, 0> >::decision_function' completely with a factor of 2047.
INFO: [HLS 200-489] Unrolling loop 'Loop-3' (firmware/BDT.h:87) in function 'BDT::Tree<0, 2047, 2038, ap_fixed<18, 8, (ap_q_mode)5, (ap_o_mode)3, 0> [30], ap_fixed<18, 8, (ap_q_mode)5, (ap_o_mode)3, 0>, ap_fixed<18, 8, (ap_q_mode)5, (ap_o_mode)3, 0> >::decision_function' completely with a factor of 2038.
INFO: [XFORM 203-101] Partitioning array 'x.V' (firmware/my_prj.cpp:4) in dimension 1 completely.
INFO: [XFORM 203-101] Partitioning array 'score.V' (firmware/my_prj.cpp:4) in dimension 1 completely.
INFO: [XFORM 203-101] Partitioning array 'scores' in dimension 1 completely.
ERROR: [XFORM 203-103] Array 'tree_9_0.feature' : partitioned elements number (2047) has exeeded the threshold (1024), which may cause long run-time.
ERROR: [HLS 200-70] Pre-synthesis failed.
command 'ap_source' returned error code
    while executing
"source build_prj.tcl"
    ("uplevel" body line 1)
    invoked from within
"uplevel \#0 [list source $arg]"
INFO: [Common 17-206] Exiting vivado_hls at Thu Aug 10 11:50:54 2023...
```
I think that for our purpose, a max depth of 8 should still work. I need to check with my team. And maybe I can just increase the memory of my machine to see if that allows building bigger trees.
Concerning the manual unrolling of trees, would it not be better to use recursive module instantiation instead? Then we could create sparse trees directly in HDL code.
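To illustrate the suggestion (a hypothetical sketch, not conifer's actual generator; the tree dict layout is made up for the example): a recursive walk naturally visits only the nodes that exist, so nothing would be emitted for the padding.

```python
# Hypothetical sparse tree: each node is a dict with optional children.
sparse_tree = {
    "feature": 0, "threshold": 0.5,
    "left": {"value": 1.0},                      # leaf at depth 1
    "right": {
        "feature": 3, "threshold": 2.5,
        "left": {"value": -0.5},                 # leaf at depth 2
        "right": {"value": 0.25},
    },
}

def count_emitted(node):
    """Recursively visit (i.e. 'instantiate') only the nodes present."""
    if node is None:
        return 0
    return 1 + count_emitted(node.get("left")) + count_emitted(node.get("right"))

def padded_nodes(max_depth):
    """Node count if the tree is instead padded out to a full binary tree."""
    return 2 ** (max_depth + 1) - 1

print(count_emitted(sparse_tree))  # 5 nodes actually present
print(padded_nodes(10))            # 2047 nodes after padding to depth 10
```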
Thanks for studying this in such detail! `Killed "$RDI_PROG" "$@"` is definitely the out-of-memory error.
Hi,
I've started working with this tool. I tried to convert a simple xgboost tree into VHDL. Conifer creates the files without a problem; however, when I try to import them into Vivado it's a mess. After struggling a bit with the library names (it's my first time using VHDL; usually I work with Verilog), I decided to go another route and convert the VHDL code into a Verilog module using GHDL (and/or Yosys).
In case someone is curious, this is how I'm currently doing it:
```
ghdl -a Constants.vhd
ghdl -a Types.vhd
ghdl -a AddReduce.vhd
ghdl -a Arrays0.vhd
ghdl -a Tree.vhd
ghdl -a BDT.vhd
ghdl -a BDTTop.vhd
ghdl --synth --out=verilog bdttop > BDTTop.v
```
It seems to be working, and while my VHDL attempts crashed Vivado, the Verilog file plays very nicely.
However I get the following warnings:
And indeed, when I try to run a simulation in Vivado I find that the output signal from the tree is 'XXXXXXXX', and I have some internal 'Z' values. Is this an issue on your side, or did something go wrong during the translation process?
(I'm using Vivado 2020.1)