tenstorrent / tt-buda

Tenstorrent TT-BUDA Repository

[bug report] BUDA fails to run (but does compile) Fast Neural Style Transfer on e75 with error `grayskull_2560_0x0.yaml does not exist` #28

Open marty1885 opened 1 month ago

marty1885 commented 1 month ago

Please refer to the original Discord thread for details.

BUDA currently appears to compile fast-neural-style-transfer from the ONNX model zoo, but execution fails with the error below. The source code is available here:

https://github.com/marty1885/fast-style-transfer-ttbuda

2024-06-06 10:57:05.399 | INFO     | Balancer        - Balancer perf score : 0.78140587
2024-06-06 10:57:08.638 | WARNING  | Placer          - Compilation Option with input queue placed on host, but Grayskull does not support fast device reads from host. Placer opting to allocate the queue on device instead.
2024-06-06 10:57:08.677 | INFO     | Backend         - Lookup contexts -- arch:system scope:device0 name:harvesting_mask
2024-06-06 10:57:08.678 | ERROR    | pybuda.device:run_next_command:469 - Compile error: Error: device descriptor file /tmp/marty/2d44d86576ee/device_descs/grayskull_2560_0x0.yaml does not exist!
Traceback (most recent call last):
  File "/home/marty/micromamba/envs/buda/lib/python3.10/site-packages/pybuda/device.py", line 458, in run_next_command
    ret = self.compile_for(
  File "/home/marty/micromamba/envs/buda/lib/python3.10/site-packages/pybuda/ttdevice.py", line 820, in compile_for
    self._compile_output = pybuda_compile_from_context(compile_context)
  File "/home/marty/micromamba/envs/buda/lib/python3.10/site-packages/pybuda/compile.py", line 248, in pybuda_compile_from_context
    next_stage = stage_to_func[current_stage](context)
  File "/home/marty/micromamba/envs/buda/lib/python3.10/site-packages/pybuda/compile.py", line 992, in run_balancer_and_placer
    context.post_placer_results = run_post_placer_buda_passes(context.lowered_graph, context.graph_name, context.device_cfg, context.placer_solution, context.post_placer_config, context.balancer_solution, instructions, allocated_blocks, current_host_address)
RuntimeError: Error: device descriptor file /tmp/marty/2d44d86576ee/device_descs/grayskull_2560_0x0.yaml does not exist!

Traceback (most recent call last):
  File "/home/marty/Documents/fast-style-transfer-ttbuda/main.py", line 28, in <module>
    outqueue = pybuda.run_inference()
  File "/home/marty/micromamba/envs/buda/lib/python3.10/site-packages/pybuda/run/api.py", line 90, in run_inference
    return _run_inference(module, inputs, input_count, output_queue, _sequential, _perf_trace, _verify_cfg)
  File "/home/marty/micromamba/envs/buda/lib/python3.10/site-packages/pybuda/run/impl.py", line 277, in _run_inference
    return _run_devices_inference(
  File "/home/marty/micromamba/envs/buda/lib/python3.10/site-packages/pybuda/run/impl.py", line 467, in _run_devices_inference
    output_queue = _initialize_pipeline(False, output_queue, sequential=sequential, verify_cfg=verify_cfg)
  File "/home/marty/micromamba/envs/buda/lib/python3.10/site-packages/pybuda/run/impl.py", line 414, in _initialize_pipeline
    _compile_devices(sequential, training=training, sample_inputs=sample_inputs, sample_targets=sample_targets, microbatch_count=microbatch_count, verify_cfg=verify_cfg)
  File "/home/marty/micromamba/envs/buda/lib/python3.10/site-packages/pybuda/run/impl.py", line 1248, in _compile_devices
    raise ret
RuntimeError: Error: device descriptor file /tmp/marty/2d44d86576ee/device_descs/grayskull_2560_0x0.yaml does not exist!
2024-06-06 10:57:08.695 | DEBUG    | pybuda.run.impl:_shutdown:1265 - PyBuda shutdown
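
To narrow this down, it may help to pull the descriptor path out of the error text and check what is actually on disk at that location. The following is only a sketch: `extract_descriptor_path` and `describe_descriptor` are hypothetical helper names, not part of PyBuda, and the regular expression assumes the error message keeps the exact wording seen in the log above.

```python
import re
from pathlib import Path

# Matches the error wording from the log above. This format is an assumption
# based on this one log, not a documented PyBuda message format.
_DESC_RE = re.compile(r"device descriptor file (\S+\.yaml) does not exist")


def extract_descriptor_path(log_text):
    """Return the missing descriptor path found in the log text, or None."""
    match = _DESC_RE.search(log_text)
    return Path(match.group(1)) if match else None


def describe_descriptor(path):
    """Report whether the descriptor file and its parent directory exist."""
    parent_exists = path.parent.is_dir()
    return {
        "path": str(path),
        "file_exists": path.is_file(),
        "parent_exists": parent_exists,
        # Other descriptors in the same directory, if any, hint at whether
        # only this one file is missing or the whole directory was never written.
        "siblings": sorted(p.name for p in path.parent.glob("*.yaml"))
        if parent_exists
        else [],
    }


if __name__ == "__main__":
    sample = (
        "RuntimeError: Error: device descriptor file "
        "/tmp/marty/2d44d86576ee/device_descs/grayskull_2560_0x0.yaml "
        "does not exist!"
    )
    found = extract_descriptor_path(sample)
    if found is not None:
        print(describe_descriptor(found))
```

If `parent_exists` is false, the whole `device_descs` directory was never created; if it is true but the file is missing, the `siblings` list shows which descriptors were written instead.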

staylorTT commented 1 month ago

@nvukobratTT can you help me find someone to look at this issue?