tum-ei-eda / mlonmcu

Tool for the deployment and analysis of TinyML applications on TFLM and MicroTVM backends
Apache License 2.0

Killed process hangs and suppresses report #56

Open rafzi opened 2 years ago

rafzi commented 2 years ago

One of my runs was killed with SIGKILL, and the session then hung. After four SIGINTs (^C below) I got back to the command prompt, but no report was generated for the runs that had been processed before.

ERROR - The process returned an non-zero exit code -9! (CMD: `/home/user1/ml_on_mcu/venv/bin/python -m tvm.driver.tvmc compile /home/user1/mlenv/temp/sessions/96/runs/4/nasnet.tflite --target c -f mlf --executor aot --runtime crt --pass-config tir.disable_vectorize=True --pass-config relay.moiopt.enable=True --pass-config relay.moiopt.noftp=False --pass-config relay.moiopt.onlyftp=False --pass-config relay.moiopt.norecurse=True --opt-level 3 --input-shapes input_1:[1,224,224,3] --model-format tflite --runtime-crt-system-lib 0 --target-c-constants-byte-alignment 4 --target-c-workspace-byte-alignment 4 --target-c-executor aot --target-c-unpacked-api 0 --target-c-interface-api packed --output /tmp/tmpa7fw4300/default.tar`)
Traceback (most recent call last):
  File "/home/user1/mlonmcu/mlonmcu/session/run.py", line 538, in process
    func()
  File "/home/user1/mlonmcu/mlonmcu/session/run.py", line 433, in build
    self.backend.generate_code()
  File "/home/user1/mlonmcu/mlonmcu/flow/tvm/backend/tvmaot.py", line 119, in generate_code
    out = self.invoke_tvmc_compile(out_path, dump=dump, verbose=verbose)
  File "/home/user1/mlonmcu/mlonmcu/flow/tvm/backend/backend.py", line 228, in invoke_tvmc_compile
    return self.invoke_tvmc("compile", *args, verbose=verbose)
  File "/home/user1/mlonmcu/mlonmcu/flow/tvm/backend/backend.py", line 220, in invoke_tvmc
    return utils.python(*pre, command, *args, live=verbose, env=env)
  File "/home/user1/mlonmcu/mlonmcu/setup/utils.py", line 171, in python
    return exec_getout(sys.executable, *args, **kwargs)
  File "/home/user1/mlonmcu/mlonmcu/setup/utils.py", line 154, in exec_getout
    assert exit_code == 0, "The process returned an non-zero exit code {}! (CMD: `{}`)".format(
AssertionError: The process returned an non-zero exit code -9! (CMD: `/home/user1/ml_on_mcu/venv/bin/python -m tvm.driver.tvmc compile /home/user1/mlenv/temp/sessions/96/runs/4/nasnet.tflite --target c -f mlf --executor aot --runtime crt --pass-config tir.disable_vectorize=True --pass-config relay.moiopt.enable=True --pass-config relay.moiopt.noftp=False --pass-config relay.moiopt.onlyftp=False --pass-config relay.moiopt.norecurse=True --opt-level 3 --input-shapes input_1:[1,224,224,3] --model-format tflite --runtime-crt-system-lib 0 --target-c-constants-byte-alignment 4 --target-c-workspace-byte-alignment 4 --target-c-executor aot --target-c-unpacked-api 0 --target-c-interface-api packed --output /tmp/tmpa7fw4300/default.tar`)
ERROR - [session-96] [run-4] Run failed at stage 'BUILD', aborting...
############### HANG HERE ###################################
^C^CTraceback (most recent call last):
  File "/home/user1/mlonmcu/mlonmcu/session/session.py", line 258, in process_runs
    _join_workers(workers)
  File "/home/user1/mlonmcu/mlonmcu/session/session.py", line 198, in _join_workers
    results.append(w.result())
  File "/usr/lib/python3.9/concurrent/futures/_base.py", line 435, in result
    self._condition.wait(timeout)
  File "/usr/lib/python3.9/threading.py", line 312, in wait
    waiter.acquire()
KeyboardInterrupt

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/user1/mlonmcu/mlonmcu/cli/main.py", line 116, in <module>
    sys.exit(main(args=sys.argv[1:]))  # pragma: no cover
  File "/home/user1/mlonmcu/mlonmcu/cli/main.py", line 107, in main
    args.func(args)
  File "/home/user1/mlonmcu/mlonmcu/cli/flow.py", line 64, in handle
    args.flow_func(args)
  File "/home/user1/mlonmcu/mlonmcu/cli/compile.py", line 108, in handle
    kickoff_runs(args, RunStage.COMPILE, context)
  File "/home/user1/mlonmcu/mlonmcu/cli/common.py", line 191, in kickoff_runs
    success = session.process_runs(
  File "/home/user1/mlonmcu/mlonmcu/session/session.py", line 290, in process_runs
    _join_workers(workers)
  File "/usr/lib/python3.9/concurrent/futures/_base.py", line 628, in __exit__
    self.shutdown(wait=True)
  File "/usr/lib/python3.9/concurrent/futures/thread.py", line 229, in shutdown
    t.join()
  File "/usr/lib/python3.9/threading.py", line 1033, in join
    self._wait_for_tstate_lock()
  File "/usr/lib/python3.9/threading.py", line 1049, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt
^CException ignored in: <module 'threading' from '/usr/lib/python3.9/threading.py'>
Traceback (most recent call last):
  File "/usr/lib/python3.9/threading.py", line 1415, in _shutdown
    atexit_call()
  File "/usr/lib/python3.9/concurrent/futures/thread.py", line 31, in _python_exit
    t.join()
  File "/usr/lib/python3.9/threading.py", line 1033, in join
    self._wait_for_tstate_lock()
  File "/usr/lib/python3.9/threading.py", line 1049, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt: 
^C⏎
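
For discussion, here is a minimal sketch of the behaviour I would expect: a failing run is recorded, the remaining workers are still joined, and a report is still produced. It is not taken from the mlonmcu code base and does not diagnose the hang itself. `describe_exit_code`, `run_stage` and `process_runs` are hypothetical names made up for this example, and the report is just a list of tuples. The only facts it relies on are that Python's `subprocess` reports a child killed by signal N as return code -N on POSIX (hence the -9 above), and that `Future.result()` re-raises the worker's exception.

```python
import concurrent.futures
import signal
import subprocess
import sys


def describe_exit_code(code):
    """Describe a subprocess return code; negative values mean 'killed by signal' on POSIX."""
    if code < 0:
        try:
            return f"killed by {signal.Signals(-code).name}"
        except ValueError:
            return f"killed by unknown signal {-code}"
    return f"exit code {code}"


def run_stage(cmd):
    """Hypothetical stand-in for one run: raise if the subprocess fails."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        raise RuntimeError(describe_exit_code(proc.returncode))
    return proc.stdout


def process_runs(commands, num_workers=2):
    """Join every worker and record failures instead of letting one abort the join."""
    report = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=num_workers) as executor:
        futures = {executor.submit(run_stage, cmd): idx for idx, cmd in enumerate(commands)}
        for future in concurrent.futures.as_completed(futures):
            idx = futures[future]
            try:
                future.result()  # re-raises the exception from the worker, if any
                report.append((idx, "ok", ""))
            except Exception as exc:
                report.append((idx, "failed", str(exc)))
    return sorted(report)


if __name__ == "__main__":
    # Assumes a POSIX system: the second command kills itself with SIGKILL.
    commands = [
        [sys.executable, "-c", "print('hi')"],
        [sys.executable, "-c", "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"],
    ]
    for idx, status, detail in process_runs(commands):
        print(idx, status, detail)
```

With a pattern like this, the report rows for the successful runs would survive even when one run's subprocess is killed, and the failed run would show up with a readable reason rather than a bare -9.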