Open trivedisarthak opened 11 months ago
Hi, have you checked that all the prerequisite packages are installed outside the Docker container?
On Mon, 17 July 2023 at 14:22, Sarthak Trivedi < @.***> wrote:
Hi,
I'm trying to convert a YOLOv7 model trained on the CrowdHuman dataset to HEF. I followed the optimization tutorial (https://hailo.ai/developer-zone/documentation/dataflow-compiler-v3-24-0/?sp_referrer=DFC_2_Model_Optimization_Tutorial.html) and am trying to optimize the network with the optimization level set to 2, with all other options set according to the alls files provided in the Hailo Model Zoo for YOLOv7. I'm using the latest Hailo Software Suite Docker image. I get the following error:
[info] Translation completed on ONNX model yolov7
[2023-07-17 10:01:27,146][hailo_sdk.client][INFO] - Translation completed on ONNX model yolov7
[info] Initialized runner for yolov7
[2023-07-17 10:01:27,703][hailo_sdk.client][INFO] - Initialized runner for yolov7
[info] Loading model script to yolov7 from string
[2023-07-17 10:01:31,186][hailo_sdk.client][INFO] - Loading model script to yolov7 from string
[info] Starting Model Optimization
[2023-07-17 10:03:11,265][hailo_sdk.client][IMPORTANT] - Starting Model Optimization
2023-07-17 10:03:11.617384: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-07-17 10:03:11.624954: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-07-17 10:03:11.625081: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-07-17 10:03:11.625693: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-07-17 10:03:11.626656: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-07-17 10:03:11.626763: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-07-17 10:03:11.626856: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-07-17 10:03:11.938253: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-07-17 10:03:11.938391: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-07-17 10:03:11.938492: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:975] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-07-17 10:03:11.938559: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:42] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2023-07-17 10:03:11.938578: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 19157 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:08:00.0, compute capability: 8.6
[info] Using calibration set of 1500 entries
[2023-07-17 10:03:13,403][hailo_sdk.client][INFO] - Using calibration set of 1500 entries
[info] Assigning 16bit activation to output layer yolov7/output_layer3
[2023-07-17 10:03:13,405][hailo_sdk.client][INFO] - Assigning 16bit activation to output layer yolov7/output_layer3
[info] Assigning 16bit activation to output layer yolov7/output_layer2
[2023-07-17 10:03:13,407][hailo_sdk.client][INFO] - Assigning 16bit activation to output layer yolov7/output_layer2
[info] Starting auto 4bit weights
[2023-07-17 10:03:13,408][hailo_sdk.client][INFO] - Starting auto 4bit weights
[info] Assigning 4bit weights to layer yolov7/conv91 with 4719.62k parameters
[2023-07-17 10:03:13,412][hailo_sdk.client][INFO] - Assigning 4bit weights to layer yolov7/conv91 with 4719.62k parameters
[info] Assigning 4bit weights to layer yolov7/conv35 with 2359.81k parameters
[2023-07-17 10:03:13,412][hailo_sdk.client][INFO] - Assigning 4bit weights to layer yolov7/conv35 with 2359.81k parameters
[info] Assigning 4bit weights to layer yolov7/conv46 with 2359.81k parameters
[2023-07-17 10:03:13,412][hailo_sdk.client][INFO] - Assigning 4bit weights to layer yolov7/conv46 with 2359.81k parameters
[info] Ratio of weights in 4bit is 0.26
[2023-07-17 10:03:13,412][hailo_sdk.client][INFO] - Ratio of weights in 4bit is 0.26
[info] auto4bit completion time 00:00:00.00
[2023-07-17 10:03:13,412][hailo_sdk.client][INFO] - auto4bit completion time 00:00:00.00
[info] Auto 4bit weights is done
[2023-07-17 10:03:13,412][hailo_sdk.client][INFO] - Auto 4bit weights is done
[info] Starting Stats Collector
[2023-07-17 10:03:17,383][acceleras][INFO] - Starting Stats Collector
Calibration: 0%| | 0/1500 [00:00<?, ?entries/s]
2023-07-17 10:03:19.256664: I tensorflow/stream_executor/cuda/cuda_blas.cc:1786] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
2023-07-17 10:03:19.970407: I tensorflow/stream_executor/cuda/cuda_dnn.cc:384] Loaded cuDNN version 8101
2023-07-17 10:03:20.385517: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2023-07-17 10:03:20.386139: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2023-07-17 10:03:20.386149: W tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn't get ptxas version string: INTERNAL: Couldn't invoke ptxas --version
2023-07-17 10:03:20.386542: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2023-07-17 10:03:20.386575: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] INTERNAL: Failed to launch ptxas
Relying on driver to perform ptx compilation.
Modify $PATH to customize ptxas location.
This message will be only logged once.
2023-07-17 10:03:55.010791: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 314572800 exceeds 10% of free system memory.
2023-07-17 10:03:55.010824: W tensorflow/core/framework/cpu_allocator_impl.cc:82] Allocation of 314572800 exceeds 10% of free system memory.
Calibration: 100%|██████████| 1500/1500 [02:37<00:00, 9.55entries/s]
[info] Stats Collector is done (completion time is 00:02:38.38)
[2023-07-17 10:05:55,764][acceleras][INFO] - Stats Collector is done (completion time is 00:02:38.38)
[info] Bias Correction skipped
[2023-07-17 10:06:08,818][acceleras][INFO] - Bias Correction skipped
[info] Adaround skipped
[2023-07-17 10:06:08,821][acceleras][INFO] - Adaround skipped
[info] Starting Fine Tune
[2023-07-17 10:06:08,822][acceleras][INFO] - Starting Fine Tune
Epoch 1/6
2023-07-17 10:07:10.005476: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:903] layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4 @ fanin shape inSelectV2_2-2-TransposeNHWCToNCHW-LayoutOptimizer
error: Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice
error: Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice
error: Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice
error: Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice
error: Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice
error: Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice
error: Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice
error: Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice
error: Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice
error: Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice
error: Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice
error: Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice
error: Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice
2023-07-17 10:07:16.352546: W tensorflow/core/framework/op_kernel.cc:1733] UNKNOWN: JIT compilation failed.
Error executing job with overrides: []
Traceback (most recent call last):
  File "convert.py", line 18, in main
    convert_obj.optimizer_har()
  File "convert.py", line 79, in optimizer_har
    self.runner.optimize(self.calib_dataset)
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
    return func(self, *args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_sdk_client/runner/client_runner.py", line 1783, in optimize
    self._optimize(calib_data, data_type=data_type, work_dir=work_dir)
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
    return func(self, *args, **kwargs)
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_sdk_client/runner/client_runner.py", line 1671, in _optimize
    self._sdk_backend.full_quantization(calib_data, data_type=data_type, work_dir=work_dir,
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py", line 869, in full_quantization
    self._full_acceleras_run()
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py", line 1019, in _full_acceleras_run
    optimization_flow.run()
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 101, in run
    self.post_quantization_optimization()
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 129, in post_quantization_optimization
    self._finetune()
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 265, in _finetune
    _, results = finetune.run()
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_optimization/algorithms/algorithm_base.py", line 119, in run
    self._run_int()
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_optimization/algorithms/finetune/qft.py", line 301, in _run_int
    self.run_qft(self._model_native, self._model, metrics=self.metrics)
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_optimization/algorithms/finetune/qft.py", line 358, in run_qft
    qft_distiller.fit(self.train_dataset, verbose=1,
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 54, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.UnknownError: Graph execution error:
Detected at node 'Adam/mod' defined at (most recent call last):
    File "convert.py", line 94, in <module>
      main()
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hydra/main.py", line 94, in decorated_main
      _run_hydra(
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hydra/_internal/utils.py", line 394, in _run_hydra
      _run_app(
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hydra/_internal/utils.py", line 457, in _run_app
      run_and_report(
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hydra/_internal/utils.py", line 220, in run_and_report
      return func()
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hydra/_internal/utils.py", line 458, in <lambda>
      lambda: hydra.run(
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 119, in run
      ret = run_job(
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hydra/core/utils.py", line 186, in run_job
      ret.return_value = task_function(task_cfg)
    File "convert.py", line 18, in main
      convert_obj.optimizer_har()
    File "convert.py", line 79, in optimizer_har
      self.runner.optimize(self.calib_dataset)
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
      return func(self, *args, **kwargs)
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_sdk_client/runner/client_runner.py", line 1783, in optimize
      self._optimize(calib_data, data_type=data_type, work_dir=work_dir)
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_sdk_common/states/states.py", line 16, in wrapped_func
      return func(self, *args, **kwargs)
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_sdk_client/runner/client_runner.py", line 1671, in _optimize
      self._sdk_backend.full_quantization(calib_data, data_type=data_type, work_dir=work_dir,
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py", line 869, in full_quantization
      self._full_acceleras_run()
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_sdk_client/sdk_backend/sdk_backend.py", line 1019, in _full_acceleras_run
      optimization_flow.run()
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 101, in run
      self.post_quantization_optimization()
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 129, in post_quantization_optimization
      self._finetune()
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_optimization/flows/optimization_flow.py", line 265, in _finetune
      _, results = finetune.run()
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_optimization/algorithms/algorithm_base.py", line 119, in run
      self._run_int()
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_optimization/algorithms/finetune/qft.py", line 301, in _run_int
      self.run_qft(self._model_native, self._model, metrics=self.metrics)
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_optimization/algorithms/finetune/qft.py", line 358, in run_qft
      qft_distiller.fit(self.train_dataset, verbose=1,
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/keras/utils/traceback_utils.py", line 64, in error_handler
      return fn(*args, **kwargs)
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/keras/engine/training.py", line 1409, in fit
      tmp_logs = self.train_function(iterator)
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/keras/engine/training.py", line 1051, in train_function
      return step_function(self, iterator)
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/keras/engine/training.py", line 1040, in step_function
      outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/keras/engine/training.py", line 1030, in run_step
      outputs = model.train_step(data)
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_optimization/acceleras/model/distiller.py", line 109, in train_step
      self.optimizer.apply_gradients(zip(gradients_f, trainable_vars_f))
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/keras/optimizers/optimizer_v2/optimizer_v2.py", line 672, in apply_gradients
      apply_state = self._prepare(var_list)
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/keras/optimizers/optimizer_v2/optimizer_v2.py", line 992, in _prepare
      self._prepare_local(var_device, var_dtype, apply_state)
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/keras/optimizers/optimizer_v2/adam.py", line 130, in _prepare_local
      super(Adam, self)._prepare_local(var_device, var_dtype, apply_state)
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/keras/optimizers/optimizer_v2/optimizer_v2.py", line 998, in _prepare_local
      lr_t = tf.identity(self._decayed_lr(var_dtype))
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/keras/optimizers/optimizer_v2/optimizer_v2.py", line 1056, in _decayed_lr
      lr_t = tf.cast(lr_t(local_step), var_dtype)
    File "/local/workspace/hailo_virtualenv/lib/python3.8/site-packages/hailo_model_optimization/algorithms/finetune/qft.py", line 58, in __call__
      step = step % self.steps_per_epoch
Node: 'Adam/mod'
JIT compilation failed.
  [[{{node Adam/mod}}]] [Op:__inference_train_function_682919]
Yes, I have installed the device drivers, and the same script loads and runs correctly with optimization level 0 (that is, without fine-tuning). I can also run the example demo for yolov7 using the optimized model.
Additionally, the TensorFlow installation in the container can see the GPU on the machine.
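Because level 0 works and the GPU is visible, the failure seems tied to the fine-tuning step, and the log earlier in this thread is noisy enough that the relevant lines are easy to miss. Below is a minimal sketch for pulling the failure signatures out of such a log. The three substrings are copied from that log; the label names and the `triage` helper are hypothetical and not part of any Hailo or TensorFlow tool.

```python
from collections import Counter

# Substrings copied from the log quoted in this thread.
# The dictionary keys are illustrative labels chosen for this sketch.
SIGNATURES = {
    "ptxas_missing": "Couldn't invoke ptxas",
    "libdevice_missing": "Can't find libdevice directory",
    "jit_failed": "JIT compilation failed",
}


def triage(log_text: str) -> Counter:
    """Count how often each known failure signature occurs in the log text."""
    counts = Counter()
    for label, needle in SIGNATURES.items():
        counts[label] = log_text.count(needle)
    return counts


# A short excerpt of the log, with line breaks restored.
sample = (
    "W tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn't get ptxas "
    "version string: INTERNAL: Couldn't invoke ptxas --version\n"
    "error: Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice\n"
    "error: Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice\n"
    "W tensorflow/core/framework/op_kernel.cc:1733] UNKNOWN: JIT compilation failed.\n"
)

for label, count in triage(sample).items():
    print(f"{label}: {count}")
```

Plain substring counting is used on purpose: the pasted log has lost its line breaks, so a line-based filter would not work on it as-is.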
Hi Sarthak, I specifically meant the NVIDIA packages:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
   && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
   && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker
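One way to confirm on the host that the package installation above took effect is to ask the Docker daemon which runtimes it has registered. The sketch below assumes that `nvidia-docker2` registers a runtime named `nvidia` and that `docker info --format '{{json .Runtimes}}'` returns a JSON map keyed by runtime name. The helper names are hypothetical, and a missing `nvidia` entry is not conclusive on setups that expose GPUs another way.

```python
import json
import shutil
import subprocess
from typing import Optional


def has_nvidia_runtime(runtimes_json: str) -> bool:
    """Return True if the JSON map of Docker runtimes contains an 'nvidia' key."""
    try:
        runtimes = json.loads(runtimes_json)
    except ValueError:
        return False
    return isinstance(runtimes, dict) and "nvidia" in runtimes


def query_docker_runtimes() -> Optional[str]:
    """Ask the local Docker daemon for its runtimes; return None if unavailable."""
    if shutil.which("docker") is None:
        return None
    try:
        result = subprocess.run(
            ["docker", "info", "--format", "{{json .Runtimes}}"],
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout.strip() if result.returncode == 0 else None


if __name__ == "__main__":
    raw = query_docker_runtimes()
    if raw is None:
        print("Could not query Docker (not installed, not running, or no permission).")
    else:
        print("nvidia runtime registered:", has_nvidia_runtime(raw))
```

Run this on the host, not inside the container, since it queries the host's Docker daemon.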
On Tue, 18 July 2023 at 11:00, Sarthak Trivedi < @.***> wrote:
Yes, I have installed the device drivers and I can correctly load and run the same script with optimization level 0; without fine-tuning and it works perfectly. I can also run the example demo https://github.com/hailo-ai/Hailo-Application-Code-Examples for yolov7 using the optimized model.
-- Regards, Nadav Eden
Yes, that is already installed. TensorFlow can access the GPU from inside the docker container.
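Seeing the GPU from TensorFlow does not by itself show that the CUDA tools used for JIT compilation are present. The log earlier in this thread reports that `ptxas` could not be invoked and that no `${CUDA_DIR}/nvvm/libdevice` directory was found. Below is a minimal sketch, using only the Python standard library, for checking both from inside the container. The candidate roots (`CUDA_HOME`, `CUDA_DIR`, `/usr/local/cuda`) are assumptions for illustration, not paths confirmed in this thread, and the helper names are hypothetical.

```python
import os
import shutil
from pathlib import Path
from typing import Dict, Iterable, List


def find_libdevice_dirs(cuda_roots: Iterable[str]) -> List[Path]:
    """Return every <root>/nvvm/libdevice directory that actually exists."""
    found = []
    for root in cuda_roots:
        candidate = Path(root) / "nvvm" / "libdevice"
        if candidate.is_dir():
            found.append(candidate)
    return found


def report(cuda_roots: Iterable[str]) -> Dict[str, object]:
    """Summarize whether ptxas is on PATH and where libdevice was found."""
    return {
        "ptxas": shutil.which("ptxas"),  # None when ptxas is not on PATH
        "libdevice_dirs": [str(p) for p in find_libdevice_dirs(cuda_roots)],
    }


if __name__ == "__main__":
    # Assumed candidate locations; adjust to the container's real layout.
    roots = [
        os.environ.get("CUDA_HOME", ""),
        os.environ.get("CUDA_DIR", ""),
        "/usr/local/cuda",
    ]
    print(report([r for r in roots if r]))
```

If a libdevice directory does turn up, XLA's `--xla_gpu_cuda_data_dir` option (passed through the `XLA_FLAGS` environment variable) is one way to point TensorFlow at that CUDA root. Whether that resolves this particular failure is not confirmed in this thread.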
hmm.. What is the GPU model?
Okay, I tried downgrading the Docker image from 2023.04 to 2022.10, and the fine-tuning works. Is this an internal bug in the Dataflow Compiler library? I'm using an RTX 3090.
Is the Docker image you're referring to the suite? If so, does it work on the latest 2023.07?
Yes, I'm referring to the Software Suite Docker image. I'll try it with 2023.07 and let you know.
Hi,
I'm trying to convert a YOLOv7 model trained on the CrowdHuman dataset to HEF. I followed the optimization tutorial and am trying to optimize the network with the optimization level set to 2, with all other options set according to the alls files provided in the Hailo Model Zoo for YOLOv7. I'm using the latest Hailo Software Suite Docker image. I get the following error: