apache / tvm

Open deep learning compiler stack for cpu, gpu and specialized accelerators
https://tvm.apache.org/
Apache License 2.0

[Bug] [VTA][SIM][AutoTvm] RPCRunner can't run on vta simulator #11160

Open wangshankun opened 2 years ago

wangshankun commented 2 years ago

Code Version:

    [v0.9.dev0 ]

Run tune_relay_vta.py through the VTA interface using the SIM (simulator) device.

Host: start a tracker

    python3 -m tvm.exec.rpc_tracker --host=0.0.0.0 --port=9190

Device: register with the host tracker

    python -m tvm.exec.rpc_server --tracker=10.100.1.24:9190 --key=sim

VTA device config info:

    tvm/3rdparty/vta-hw/config/vta_config.json

image
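
Since the screenshot of the config is not reproduced here, the following is a minimal sketch of how the configured target could be read back from the JSON file. It assumes the config carries a top-level `TARGET` field (as the stock vta-hw config does, to my knowledge); the helper name `read_vta_target` is hypothetical and not part of TVM or VTA.

```python
import json
from pathlib import Path


def read_vta_target(config_path):
    """Return the value of the top-level "TARGET" key, or None if absent.

    Assumption: the VTA config is a flat JSON object with a "TARGET" entry
    such as "sim"; this helper is illustrative, not a TVM/VTA API.
    """
    cfg = json.loads(Path(config_path).read_text())
    return cfg.get("TARGET")


# Example (path taken from the report above):
# read_vta_target("tvm/3rdparty/vta-hw/config/vta_config.json")
```

For the simulator setup described in this report, the value would be expected to be `sim`.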

Check: Device is OK

image

But got error:

    Traceback (most recent call last):
      File "/home/code/tvm_learn/tune_relay_vta.py", line 454, in <module>
        tune_and_evaluate(tuning_option)
      File "/home/code/tvm_learn/tune_relay_vta.py", line 397, in tune_and_evaluate
        tune_tasks(tasks, tuning_opt)
      File "/home/code/tvm_learn/tune_relay_vta.py", line 286, in tune_tasks
        tuner_obj.tune(
      File "/home/code/tvm/python/tvm/autotvm/tuner/tuner.py", line 113, in tune
        measure_batch = create_measure_batch(self.task, measure_option)
      File "/home/code/tvm/python/tvm/autotvm/measure/measure.py", line 282, in create_measure_batch
        attach_objects = runner.set_task(task)
      File "/home/code/tvm/python/tvm/autotvm/measure/measure_methods.py", line 329, in set_task
        raise RuntimeError(
    RuntimeError: Cannot get remote devices from the tracker. Please check the status of tracker by 'python -m tvm.exec.query_rpc_tracker --port [THE PORT YOU USE]' and make sure you have free devices on the queue status.
    Exception in thread Thread-2:
    Traceback (most recent call last):
      File "/home/anaconda3/envs/Libra_Env/lib/python3.9/threading.py", line 973, in _bootstrap_inner
        self.run()
      File "/home/anaconda3/envs/Libra_Env/lib/python3.9/threading.py", line 910, in run
        self._target(*self._args, **self._kwargs)
      File "/home/code/tvm/python/tvm/autotvm/measure/measure_methods.py", line 826, in _check
        while not dev.exist:  # wait until we get an available device
      File "/home/code/tvm/python/tvm/_ffi/runtime_ctypes.py", line 266, in exist
        return self._GetDeviceAttr(self.device_type, self.device_id, 0) != 0
      File "/home/code/tvm/python/tvm/_ffi/runtime_ctypes.py", line 249, in _GetDeviceAttr
        return tvm.runtime._ffi_api.GetDeviceAttr(device_type, device_id, attr_id)
      File "/home/code/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 237, in __call__
        raise get_last_ffi_error()
    tvm._ffi.base.TVMError: Traceback (most recent call last):
      12: TVMFuncCall
            at /home/code/tvm/src/runtime/c_runtime_api.cc:477
      11: tvm::runtime::PackedFuncObj::CallPacked(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue) const
            at /home/code/tvm/include/tvm/runtime/packed_func.h:1217
      10: Call
            at /home/code/tvm/include/tvm/runtime/packed_func.h:1213
      9: operator()
            at /home/code/tvm/src/runtime/c_runtime_api.cc:651
      8: tvm::runtime::RPCDeviceAPI::GetAttr(DLDevice, tvm::runtime::DeviceAttrKind, tvm::runtime::TVMRetValue)
            at /home/code/tvm/src/runtime/rpc/rpc_device_api.cc:43
      7: non-virtual thunk to tvm::runtime::RPCClientSession::GetAttr(DLDevice, tvm::runtime::DeviceAttrKind, tvm::runtime::TVMRetValue)
      6: Call
            at /home/code/tvm/include/tvm/runtime/packed_func.h:1213
      5: operator()
            at /home/code/tvm/src/runtime/rpc/rpc_endpoint.cc:677
      4: tvm::runtime::RPCEndpoint::HandleUntilReturnEvent(bool, std::function<void (tvm::runtime::TVMArgs)>)
            at /home/code/tvm/src/runtime/rpc/rpc_endpoint.cc:635
      3: WriteWithCallback<tvm::runtime::RPCEndpoint::HandleUntilReturnEvent(bool, tvm::runtime::RPCSession::FEncodeReturn)::<lambda(void, size_t)> >
            at /home/code/tvm/src/runtime/rpc/../../support/ring_buffer.h:163
      2: operator()
            at /home/code/tvm/src/runtime/rpc/rpc_endpoint.cc:635
      1: tvm::runtime::SockChannel::Recv(void, unsigned long)
            at /home/code/tvm/src/runtime/rpc/rpc_socket_impl.cc:58
      0: tvm::support::Socket::Error(char const)
            at /home/code/tvm/src/runtime/rpc/../../support/socket.h:362
      File "/home/code/tvm/src/runtime/rpc/../../support/socket.h", line 362
    TVMError: Socket SockChannel::Recv Error:Connection reset by peer
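
The error message suggests checking the tracker with `tvm.exec.query_rpc_tracker`. As a rough, TVM-independent first check, the TCP reachability of the tracker port can be probed from the machine running the tuning script. This is only a sketch using the Python standard library; the helper name `port_is_open` is hypothetical. It does not verify that a device with key `sim` is free in the queue, so `query_rpc_tracker` is still needed for that.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be established.

    Hypothetical helper for illustration; it only tests connectivity and
    says nothing about the tracker's queue status.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example with the tracker address used in this report:
# port_is_open("10.100.1.24", 9190)
```

A `False` result would point at a network or firewall problem rather than at AutoTVM itself; a `True` result only rules that out.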

Reproduction

Delete the `return` statement in the middle of tune_relay_vta.py (see the screenshot below):

image

Then run this command:

    python tvm/vta/tutorials/autotvm/tune_relay_vta.py

cc @elvin-n @icemist

wangshankun commented 2 years ago

Same error when using autotvm.LocalRunner:

image