google / CFU-Playground

Want a faster ML processor? Do it yourself! -- A framework for playing with custom opcodes to accelerate TensorFlow Lite for Microcontrollers (TFLM). Online tutorial: https://google.github.io/CFU-Playground/ For reference docs, see the link below.
http://cfu-playground.rtfd.io/
Apache License 2.0

Digilent Arty xc7a100t failure #606

Closed: rajsaktish closed this issue 2 years ago

rajsaktish commented 2 years ago

Hello @tcal-x

I am using the Digilent Arty board containing the 100T device with the CFU Playground on the default proj_template example.

With the command line below, the build fails at the symbiflow_write_fasm stage. Can you please provide pointers to get past this, so that I can then program the board and load software onto it?

Command line:

%make prog TARGET=digilent_arty USE_SYMBIFLOW=1 EXTRA_LITEX_ARGS="--variant=a7-100"

Terminal messages:

python3 -m litex.soc.software.mkmscimg bios.bin --little
python3 -m litex.soc.software.memusage bios.elf /home/rajsaktish/repos/enhance_CFU/as_is/CFU-Playground/soc/build/digilent_arty.proj_template/software/bios/../include/generated/regions.ld riscv64-unknown-elf

ROM usage: 24.86KiB (19.42%)
RAM usage: 1.60KiB (20.02%)

rm crt0.o
make[4]: Leaving directory '/home/rajsaktish/repos/enhance_CFU/as_is/CFU-Playground/soc/build/digilent_arty.proj_template/software/bios'
INFO:SoC:Initializing ROM rom with contents (Size: 0x6380).
INFO:SoC:Auto-Resizing ROM rom from 0x20000 to 0x6380.
make[4]: Entering directory '/home/rajsaktish/repos/enhance_CFU/as_is/CFU-Playground/soc/build/digilent_arty.proj_template/gateware'
symbiflow_synth -t digilent_arty -v /home/rajsaktish/repos/enhance_CFU/as_is/CFU-Playground/proj/proj_template/cfu.v /home/rajsaktish/repos/enhance_CFU/as_is/CFU-Playground/third_party/python/pythondata_cpu_vexriscv/pythondata_cpu_vexriscv/verilog/VexRiscv_FullCfu.v /home/rajsaktish/repos/enhance_CFU/as_is/CFU-Playground/soc/build/digilent_arty.proj_template/gateware/digilent_arty.v -d artix7 -p xc7a100tcsg324-1 -x digilent_arty.xdc > /dev/null
symbiflow_pack -e digilent_arty.eblif -d xc7a100t_test -s digilent_arty.sdc > /dev/null
symbiflow_place -e digilent_arty.eblif -d xc7a100t_test -n digilent_arty.net -P xc7a100tcsg324-1 -s digilent_arty.sdc > /dev/null
Warning: IDELAY_GROUPS parameters are currently being ignored!
symbiflow_route -e digilent_arty.eblif -d xc7a100t_test -s digilent_arty.sdc > /dev/null
symbiflow_write_fasm -e digilent_arty.eblif -d xc7a100t_test > /dev/null
/home/rajsaktish/repos/enhance_CFU/as_is/CFU-Playground/env/symbiflow/bin/vpr_common: line 116: 115099 Killed genfasm ${ARCH_DEF} ${EBLIF} --device ${DEVICE_NAME} ${VPR_OPTIONS} --read_rr_graph ${RR_GRAPH} $@
make[4]: *** [Makefile:46: digilent_arty.fasm] Error 137
make[4]: Leaving directory '/home/rajsaktish/repos/enhance_CFU/as_is/CFU-Playground/soc/build/digilent_arty.proj_template/gateware'
Traceback (most recent call last):
  File "./common_soc.py", line 57, in <module>
    main()
  File "./common_soc.py", line 53, in main
    workflow.run()
  File "/home/rajsaktish/repos/enhance_CFU/as_is/CFU-Playground/soc/board_specific_workflows/general.py", line 125, in run
    soc_builder = self.build_soc(soc)
  File "/home/rajsaktish/repos/enhance_CFU/as_is/CFU-Playground/soc/board_specific_workflows/digilent_arty.py", line 73, in build_soc
    return super().build_soc(soc, kwargs)
  File "/home/rajsaktish/repos/enhance_CFU/as_is/CFU-Playground/soc/board_specific_workflows/general.py", line 102, in build_soc
    soc_builder.build(run=self.args.build, kwargs)
  File "/home/rajsaktish/repos/enhance_CFU/as_is/CFU-Playground/third_party/python/litex/litex/soc/integration/builder.py", line 350, in build
    vns = self.soc.build(build_dir=self.gateware_dir, *kwargs)
  File "/home/rajsaktish/repos/enhance_CFU/as_is/CFU-Playground/third_party/python/litex/litex/soc/integration/soc.py", line 1147, in build
    return self.platform.build(self, args, kwargs)
  File "/home/rajsaktish/repos/enhance_CFU/as_is/CFU-Playground/third_party/python/litex/litex/build/xilinx/platform.py", line 73, in build
    return self.toolchain.build(self, args, kwargs)
  File "/home/rajsaktish/repos/enhance_CFU/as_is/CFU-Playground/third_party/python/litex/litex/build/xilinx/symbiflow.py", line 241, in build
    _run_make()
  File "/home/rajsaktish/repos/enhance_CFU/as_is/CFU-Playground/third_party/python/litex/litex/build/xilinx/symbiflow.py", line 84, in _run_make
    raise OSError("Error occured during Symbiflow's script execution.")
OSError: Error occured during Symbiflow's script execution.
make[3]: *** [/home/rajsaktish/repos/enhance_CFU/as_is/CFU-Playground/soc/common_soc.mk:115: build/digilent_arty.proj_template/gateware/digilent_arty.bit] Error 1
make[3]: Leaving directory '/home/rajsaktish/repos/enhance_CFU/as_is/CFU-Playground/soc'
make[2]: *** [../proj.mk:310: prog] Error 2
make[2]: Leaving directory '/home/rajsaktish/repos/enhance_CFU/as_is/CFU-Playground/proj/proj_template'

Thanks Raj

danc86 commented 2 years ago
/home/rajsaktish/repos/enhance_CFU/as_is/CFU-Playground/env/symbiflow/bin/vpr_common: line 116: 115099 Killed genfasm

This means the genfasm process was killed by SIGKILL. That is often the kernel's OOM killer stepping in because the system has run out of memory. In that case there is usually a corresponding kernel message (try journalctl -b -k).

Are you running this inside a VM? How much memory does it have available?

I'm not sure how much memory the genfasm step is expected to consume, but it might be relatively large.
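For anyone landing here from a search: the "Killed" message and make's "Error 137" say the same thing. Shells conventionally report a command that died from signal N as exit status 128 + N, so 137 corresponds to signal 9. A minimal sketch for decoding such a status follows; the helper name `describe_exit_status` is made up for illustration, and the signal names are the ones Python reports on Linux.

```python
import signal

def describe_exit_status(status: int) -> str:
    """Explain a shell-style exit status such as make's 'Error 137'.

    Assumes the common shell convention: status = 128 + signal number
    when the command was terminated by a signal.
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with code {status}"

print(describe_exit_status(137))
```

On Linux this reports SIGKILL for 137, which is what the OOM killer sends. SIGKILL alone does not prove an out-of-memory kill, so the kernel log is still the place to confirm it.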

rajsaktish commented 2 years ago

Thanks @danc86.

My desktop has 8GB memory. I am running the commands natively on linux, without virtual machines.

Tail lines of journalctl -k -b below:

$ grep -i memtotal /proc/meminfo
MemTotal: 8045528 kB

$ tail jctl_log.txt
Jun 23 18:27:29 desktop kernel: [ 115649] 1000 115649 656 0 40960 24 0 sh
Jun 23 18:27:29 desktop kernel: [ 115650] 1000 115650 2411 4 53248 133 0 pyrun
Jun 23 18:27:29 desktop kernel: [ 115656] 1000 115656 19392 130 204800 14983 0 python3
Jun 23 18:27:29 desktop kernel: [ 119744] 1000 119744 2148 46 53248 56 0 make
Jun 23 18:27:29 desktop kernel: [ 119851] 1000 119851 657 0 45056 24 0 sh
Jun 23 18:27:29 desktop kernel: [ 119852] 1000 119852 2411 4 57344 106 0 symbiflow_write
Jun 23 18:27:29 desktop kernel: [ 119862] 1000 119862 2259549 1678924 13971456 28008 0 genfasm
Jun 23 18:27:29 desktop kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/user@1000.service,task=genfasm,pid=119862,uid=1000
Jun 23 18:27:29 desktop kernel: Out of memory: Killed process 119862 (genfasm) total-vm:9038196kB, anon-rss:6715692kB, file-rss:4kB, shmem-rss:0kB, UID:1000 pgtables:13644kB oom_score_adj:0
Jun 23 18:27:29 desktop kernel: oom_reaper: reaped process 119862 (genfasm), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

Any suggestions to make it work?

Thanks Raj

danc86 commented 2 years ago

The output you pasted shows genfasm is consuming ~6.7GB of memory when the OOM killer is triggered.
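That figure comes from the anon-rss field of the "Out of memory: Killed process" line. A rough sketch of pulling the numbers out of such a line is below. The function name `parse_oom_kill` is hypothetical and the regexes are only fitted to the message format pasted above; other kernel versions may word it differently. The kernel's "kB" here is units of 1024 bytes, so the same value works out to about 6.4 GiB.

```python
import re

# The kernel message from the journal excerpt above, timestamp prefix removed.
OOM_LINE = (
    "Out of memory: Killed process 119862 (genfasm) "
    "total-vm:9038196kB, anon-rss:6715692kB, file-rss:4kB, shmem-rss:0kB, "
    "UID:1000 pgtables:13644kB oom_score_adj:0"
)

def parse_oom_kill(line):
    """Return pid, process name and every '<field>:<n>kB' value, or None."""
    match = re.search(r"Killed process (\d+) \(([^)]+)\)", line)
    if match is None:
        return None
    # Only fields that carry a kB suffix are collected (UID etc. are skipped).
    sizes = {key: int(value) for key, value in re.findall(r"([\w-]+):(\d+)kB", line)}
    return {"pid": int(match.group(1)), "name": match.group(2), **sizes}

info = parse_oom_kill(OOM_LINE)
rss_gib = info["anon-rss"] / 1024**2  # kB (1024-byte units) to GiB
print(f"{info['name']} (pid {info['pid']}): {rss_gib:.2f} GiB resident when killed")
```

The resident size at kill time is a lower bound on what the process needs, since it was still growing when the kernel stopped it.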

I also found this old issue about the same problem (genfasm consuming 7GB when targeting Arty 100T):
https://github.com/chipsalliance/f4pga-examples/issues/37

So unfortunately it seems to be a known issue and probably the only workaround is to run on a system with more memory available (or swap I suppose).

FYI I tried it on my workstation (proj_template for Arty 100T) and genfasm goes up to 7618316KiB max resident size according to /usr/bin/time. So you may be able to build it with only slightly more memory or swap (a few more GB).
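To put "a few more GB" into numbers, here is a small sketch that compares MemTotal plus SwapTotal from /proc/meminfo against the 7618316 KiB peak measured above. The function names, the `reserve_kib` allowance for the OS and other processes, and the SwapTotal value of 0 in the sample are all assumptions for illustration; only the MemTotal figure and the peak come from this thread.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values

def shortfall_kib(meminfo, needed_kib, reserve_kib=0):
    """How many KiB of RAM or swap are missing; 0 if there is enough.

    reserve_kib is a hypothetical allowance for everything else running
    on the machine while the build is in progress.
    """
    available = meminfo.get("MemTotal", 0) + meminfo.get("SwapTotal", 0)
    return max(0, needed_kib + reserve_kib - available)

PEAK_GENFASM_KIB = 7618316  # max resident size reported by /usr/bin/time above

# MemTotal is the value from this thread; SwapTotal of 0 is an assumption.
sample = "MemTotal: 8045528 kB\nSwapTotal: 0 kB\n"
info = parse_meminfo(sample)

# On a real machine, read the live values instead:
#   info = parse_meminfo(open("/proc/meminfo").read())
missing = shortfall_kib(info, PEAK_GENFASM_KIB, reserve_kib=1024 * 1024)
print(f"short by {missing} KiB ({missing / 1024**2:.2f} GiB)")
```

With no reserve the 8GB machine looks just about large enough on paper, which matches what was seen: genfasm got most of the way before the kernel stepped in. Any realistic allowance for the rest of the system pushes it over, so extra swap or RAM is needed.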

rajsaktish commented 2 years ago

Trying it on a machine with larger memory seemed to work. Thanks @danc86