firesim / firesim

FireSim: Fast and Effortless FPGA-accelerated Hardware Simulation with On-Prem and Cloud Flexibility
https://fires.im

XRT not in `PATH` when running `infrasetup` #1418

Open caizixian opened 1 year ago

caizixian commented 1 year ago

Background Work

FireSim Version and Hash

https://github.com/firesim/firesim/commit/6652b02f563d676c772cd6a24490911552013a4c

OS Setup

Linux alveo 5.15.0-58-generic #64~20.04.1-Ubuntu SMP Fri Jan 6 16:42:31 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.5 LTS
Release:        20.04
Codename:       focal

Other Setup

Current Behavior

Running `firesim infrasetup` aborts with the following error:

[localhost] Checking if host instance is up...
[localhost] Copying FPGA simulation infrastructure for slot: 0.
[localhost] Clearing all FPGA Slots.
!!! Parallel execution exception under host 'localhost':
Process localhost:
Traceback (most recent call last):
  File "/home/zixianc/chipyard/.conda-env/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/zixianc/chipyard/.conda-env/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/zixianc/chipyard/.conda-env/lib/python3.9/site-packages/fabric/tasks.py", line 217, in _parallel_wrap
    queue.put({'name': name, 'result': task.run(*args, **kwargs)})
  File "/home/zixianc/chipyard/.conda-env/lib/python3.9/site-packages/fabric/tasks.py", line 168, in run
    return self.wrapped(*args, **kwargs)
  File "/home/zixianc/chipyard/.conda-env/lib/python3.9/site-packages/fabric/decorators.py", line 176, in inner
    return func(*args, **kwargs)
  File "/home/zixianc/chipyard/sims/firesim/deploy/runtools/firesim_topology_with_passes.py", line 432, in infrasetup_node_wrapper
    my_node.instance_deploy_manager.infrasetup_instance()
  File "/home/zixianc/chipyard/sims/firesim/deploy/runtools/run_farm_deploy_managers.py", line 681, in infrasetup_instance
    self.clear_fpgas()
  File "/home/zixianc/chipyard/sims/firesim/deploy/runtools/run_farm_deploy_managers.py", line 661, in clear_fpgas
    with open(temp_file, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/xbutil-examine-out.json'
Fatal error: One or more hosts failed while executing task 'infrasetup_node_wrapper'
Underlying exception:
    No such file or directory
Aborting.
Fatal error.
Traceback (most recent call last):
  File "/home/zixianc/chipyard/sims/firesim/deploy/firesim", line 510, in <module>
    main(args)
  File "/home/zixianc/chipyard/sims/firesim/deploy/firesim", line 449, in main
    t['task'](t['config'](args))
  File "/home/zixianc/chipyard/sims/firesim/deploy/firesim", line 228, in infrasetup
    runtime_conf.infrasetup()
  File "/home/zixianc/chipyard/sims/firesim/deploy/runtools/runtime_config.py", line 666, in infrasetup
    self.firesim_topology_with_passes.infrasetup_passes(use_mock_instances_for_testing)
  File "/home/zixianc/chipyard/sims/firesim/deploy/runtools/firesim_topology_with_passes.py", line 436, in infrasetup_passes
    execute(infrasetup_node_wrapper, self.run_farm, hosts=all_run_farm_ips)
  File "/home/zixianc/chipyard/.conda-env/lib/python3.9/site-packages/fabric/tasks.py", line 411, in execute
    error(err, exception=d['results'])
  File "/home/zixianc/chipyard/.conda-env/lib/python3.9/site-packages/fabric/utils.py", line 357, in error
    return func(message)
  File "/home/zixianc/chipyard/.conda-env/lib/python3.9/site-packages/fabric/utils.py", line 65, in abort
    raise e
SystemExit: 1
The full log of this run is:
/home/zixianc/chipyard/sims/firesim/deploy/logs/2023-01-30--04-47-09-infrasetup-B5RQEDJES3V37Z9F.log

Expected Behavior

`firesim infrasetup` completes without error.

Other Information

This might be related to #1230 and https://groups.google.com/g/firesim/c/yySTMsRZSQI.

The official XRT debs install XRT under /opt/xilinx/xrt, and it is expected to be used via `source /opt/xilinx/xrt/setup.sh`. However, the setup script is not sourced in a non-login, non-interactive SSH session (which I believe is what happens when FireSim interacts with the host via Fabric), so `xbutil` never ends up in `PATH`.
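One way to check this diagnosis is to ask the host, through the same kind of non-interactive SSH command, whether it can resolve `xbutil`. The following is a minimal sketch and not FireSim code: the function name `tool_visible_over_ssh` and its injectable `runner` parameter are hypothetical, and it assumes key-based SSH access to the host and a remote shell that supports `command -v`.

```python
import shlex
import subprocess


def tool_visible_over_ssh(host, tool, runner=subprocess.run):
    """Return True if `tool` resolves in a non-interactive SSH command on `host`.

    `ssh host <command>` runs without a login or interactive shell, which
    approximates the environment a Fabric-driven command sees.
    `runner` is injectable (hypothetical design choice) so the check can be
    exercised without a real SSH connection.
    """
    remote_command = f"command -v {shlex.quote(tool)}"
    result = runner(
        ["ssh", host, remote_command],
        capture_output=True,
        text=True,
    )
    # `command -v` exits with status 0 only when the tool is found.
    return result.returncode == 0
```

Calling `tool_visible_over_ssh("localhost", "xbutil")` before and after changing the environment should show whether the change actually reaches non-interactive sessions.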

I worked around the problem by setting the environment variables manually in /etc/environment.
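For anyone reproducing this workaround, here is a rough sketch of what such entries might look like. It is an illustration under assumptions, not the exact lines used above: the variable names (`XILINX_XRT`, `PATH`, `LD_LIBRARY_PATH`) are assumed from what XRT's `setup.sh` usually exports, the helper `xrt_environment_entries` and the default `base_path` are hypothetical, and the result should be checked against the `setup.sh` that is actually installed. Because /etc/environment holds plain `KEY=value` pairs without shell expansion, `PATH` has to be written out in full rather than as `$PATH`.

```python
def xrt_environment_entries(
    xrt_root="/opt/xilinx/xrt",
    base_path="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
):
    """Build KEY="value" lines suitable for /etc/environment.

    `base_path` is an assumed system PATH; replace it with the PATH line
    already present in the host's /etc/environment.
    """
    return [
        f'XILINX_XRT="{xrt_root}"',
        # Prepend the XRT bin directory so xbutil resolves first.
        f'PATH="{xrt_root}/bin:{base_path}"',
        f'LD_LIBRARY_PATH="{xrt_root}/lib"',
    ]


def render(entries):
    """Join the entries into text that can be reviewed before editing the file."""
    return "\n".join(entries) + "\n"
```

Printing `render(xrt_environment_entries())` gives text to review and merge by hand; the sketch deliberately does not write to /etc/environment itself.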

ncppd commented 1 year ago

@caizixian Have you tried the `firesim runworkload` step? I am stuck on the error described in #1420.

caizixian commented 1 year ago

I'm able to run a workload (buildroot) after setting the environment variables in /etc/environment.