Y-T-G closed this issue 1 year ago
Hey @Y-T-G, thanks for the bug report. We can reproduce this issue and have a fix for it. I'll ping you once it's available in deepsparse-nightly.
A fix will be available the next time a nightly release goes out, and I'll close the issue then!
@tlrmchlsmth Cool. Thanks for the fix.
When can I expect the nightly to be available?
Hello @Y-T-G
The latest nightly has been released. Please try: pip install deepsparse-nightly
- THANK YOU! 🥇
Jeannie / Neural Magic
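After installing, it can help to confirm which DeepSparse build is actually active in the environment. The following is a minimal sketch using only the Python standard library; the helper name `installed_deepsparse` is made up for illustration, and it assumes the packages are registered under the distribution names `deepsparse-nightly` and `deepsparse`.

```python
from importlib import metadata


def installed_deepsparse():
    """Return (distribution name, version) for the first DeepSparse
    distribution found, or None if neither is installed."""
    # Assumed distribution names; the nightly build is checked first.
    for dist in ("deepsparse-nightly", "deepsparse"):
        try:
            return dist, metadata.version(dist)
        except metadata.PackageNotFoundError:
            continue
    return None


result = installed_deepsparse()
print(result if result else "No DeepSparse distribution found")
```

If both distributions show up in `pip list`, uninstalling the stable one before retrying is probably the safer route, since the two may install into the same import path.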
Hi @Y-T-G here is a Colab notebook showing how to export the ONNX and run it on deepsparse-nightly: https://colab.research.google.com/drive/16r8fLUgAEqPWbDlQmgrmmuq8WvFXxLnQ?usp=sharing
@jeanniefinks @mgoin Thanks. I will try it out.
Thanks for sharing, @Y-T-G, very cool project! Let me know if you'd be interested in sparsifying the model for more performance.
@mgoin Sure. That would be great. I was wondering how to improve the FPS further.
@Y-T-G Here is a quick Colab notebook I made, using a T4 GPU, to one-shot sparsify/quantize the model: https://colab.research.google.com/drive/1DLB-tE1ide-55b9gzq6kQyrrW0lvT7xj?usp=sharing
It uses our Sparsify tool (in alpha right now, so leave feedback!) to optimize the ONNX with some (dummy) calibration data: https://github.com/neuralmagic/sparsify
Here is the low-sparsity ONNX: https://drive.google.com/file/d/1qMZCtikHtS4Edy0EBP9R7qX5i9eLyOvz/view?usp=sharing
Here is the high-sparsity ONNX: https://drive.google.com/file/d/1XkVYhX4SJfM0mLRuH-RIx6F0xQ9-vAYy/view?usp=drive_link
I used dummy data, so the model likely isn't accurate, but you can substitute real input data to maintain accuracy.
If you want to talk more about this, I'm happy to jump on a call, or you can join our Slack to ask questions: https://join.slack.com/t/discuss-neuralmagic/shared_invite/zt-q1a1cnvo-YBoICSIw3L1dmQpjBeDurQ
Base model benchmark:
Sparsified model benchmark:
Describe the bug
I am trying to compile the ONNX model for YOLO-NAS in Google Colab.
Expected behavior
The model should compile.
Environment
Include all relevant environment information:
DeepSparse version or commit hash: 1.5.2
CPU info: {'L1_data_cache_size': 32768, 'L1_instruction_cache_size': 32768, 'L2_cache_size': 262144, 'L3_cache_size': 57671680, 'architecture': 'x86_64', 'available_cores_per_socket': 1, 'available_num_cores': 1, 'available_num_hw_threads': 2, 'available_num_numa': 1, 'available_num_sockets': 1, 'available_sockets': 1, 'available_threads_per_core': 2, 'bf16': False, 'cores_per_socket': 1, 'dotprod': False, 'i8mm': False, 'isa': 'avx2', 'num_cores': 1, 'num_hw_threads': 2, 'num_numa': 1, 'num_sockets': 1, 'threads_per_core': 2, 'vbmi': False, 'vbmi2': False, 'vendor': 'GenuineIntel', 'vendor_id': 'Intel', 'vendor_model': 'Intel(R) Xeon(R) CPU @ 2.20GHz', 'vnni': False, 'zen1': False}
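The CPU info dictionary reports cache sizes in bytes and instruction-set extensions as booleans. The sketch below (the helper name `summarize` is made up for illustration, and the dictionary is a subset copied from this report) converts the sizes to the units used in the engine log and lists which extension flags are enabled.

```python
# Subset of the CPU info dictionary from this report.
cpu_info = {
    "L1_data_cache_size": 32768,
    "L2_cache_size": 262144,
    "L3_cache_size": 57671680,
    "isa": "avx2",
    "bf16": False,
    "dotprod": False,
    "i8mm": False,
    "vbmi": False,
    "vbmi2": False,
    "vnni": False,
    "zen1": False,
}


def summarize(info):
    """Convert cache sizes from bytes and collect the boolean flags set to True."""
    enabled = sorted(k for k, v in info.items() if isinstance(v, bool) and v)
    return {
        "isa": info["isa"],
        "l1_data_kib": info["L1_data_cache_size"] / 1024,
        "l2_mib": info["L2_cache_size"] / 2**20,
        "l3_mib": info["L3_cache_size"] / 2**20,
        "enabled_flags": enabled,
    }


summary = summarize(cpu_info)
print(summary)
```

For this Colab instance the result is 32 KiB of L1 data cache, 0.25 MiB of L2, and 55 MiB of L3, which agrees with the cache sizes printed in the engine log. No extension flags are set, so AVX2 is the only ISA reported.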
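To Reproduce
Exact steps to reproduce the behavior:
from deepsparse import compile_model
from deepsparse.utils import generate_random_inputs
# onnx_filepath and batch_size are set beforehand; their values are not given in this report
# Generate random sample input
inputs = generate_random_inputs(onnx_filepath, batch_size)
# Compile and run
engine = compile_model(onnx_filepath, batch_size)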
{"pid":7,"type":"jupyter","level":40,"msg":"DeepSparse, Copyright 2021-present / Neuralmagic, Inc. version: 1.5.2 COMMUNITY | (93c38382) (release) (optimized) (system=avx2, binary=avx2)","time":"2023-08-07T02:35:50.156Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"DeepSparse, Copyright 2021-present / Neuralmagic, Inc. version: 1.5.2 (93c38382) (release) (optimized) (system=avx2, binary=avx2)","time":"2023-08-07T02:35:51.124Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"Date: ","time":"2023-08-07T02:35:51.124Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"08-07-2023 @ 02:35:51 UTC","time":"2023-08-07T02:35:51.124Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"OS: ","time":"2023-08-07T02:35:51.125Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"Linux 4960402abf9e 5.15.109+ #1 SMP Fri Jun 9 10:57:30 UTC 2023","time":"2023-08-07T02:35:51.125Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"Arch: x86_64","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"CPU: GenuineIntel","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"Vendor: Intel","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"Cores/sockets/threads: [1, 1, 2]","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"Available cores/sockets/threads: [1, 1, 2]","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"L1 cache size data/instruction: 32k/32k","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"L2 cache size: 0.25Mb","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"L3 cache size: 55Mb","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"Total memory: 12.6784G","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"Free memory: 8.24567G","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"Assertion at src/lib/engine/execution/layouts/greedy_assign_layouts.cpp:495","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"Backtrace:","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":" 0# wand::detail::abort_prefix(std::ostream&, char const, char const, int, bool, bool, unsigned long) in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":" 1# wand::detail::assert_fail(char const, char const, int) in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":" 2# 0x00007DA04967948A in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":" 3# 0x00007DA04967A6C9 in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":" 4# 0x00007DA04967A85C in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":" 5# 0x00007DA0490BCF86 in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":" 6# 0x00007DA0490C18DC in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":" 7# 0x00007DA0490C709E in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":" 8# 0x00007DA04901E46A in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":" 9# 0x00007DA04900AD73 in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"10# 0x00007DA048DBC400 in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"11# wand::engine::compiler::compiler::plan_execution_graph(boost::adjacency_list<boost::multisetS, boost::listS, boost::bidirectionalS, wand::engine::execution::data_descriptor, wand::engine::execution::graph_edge, boost::no_property, boost::listS> const&) const in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"12# wand::engine::compiler::compiler::compile(boost::adjacency_list<boost::multisetS, boost::listS, boost::bidirectionalS, wand::engine::execution::data_descriptor, wand::engine::execution::graph_edge, boost::no_property, boost::listS> const&) const in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"13# wand::engine::compiler::compiler::compile(wand::engine::compute::compute_graph const&) const in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"14# wand::engine::compiler::compiler::compile(wand::engine::intake::graph const&) const in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"15# 0x00007DA048278854 in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"16# 0x00007DA04826C6FF in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"17# 0x00007DA04825740C in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"18# 0x00007DA048258561 in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"19# 0x00007DA048916E40 in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"20# 0x00007DA04891C27E in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"21# 0x00007DA04891EDD7 in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"22# 0x00007DA04891F204 in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"23# 0x00007DA04817EE9E in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.127Z","v":0}
{"pid":7,"type":"jupyter","level":40,"msg":"Please email a copy of this stack trace and any additional information to: support@neuralmagic.com","time":"2023-08-07T02:35:51.127Z","v":0}
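The log above is a stream of JSON records from the Colab runtime, with the engine's own output carried in each record's "msg" field. To read the stack trace as plain text, the records can be decoded one after another. This is a minimal sketch using only the standard library; `iter_messages` is a made-up helper name, and the sample holds three records copied from the log.

```python
import json


def iter_messages(blob):
    """Yield the "msg" field of each JSON record in a blob of concatenated records."""
    # strict=False tolerates raw newlines inside strings, which can appear
    # when a long record gets wrapped during copy/paste.
    decoder = json.JSONDecoder(strict=False)
    pos, end = 0, len(blob)
    while pos < end:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < end and blob[pos].isspace():
            pos += 1
        if pos >= end:
            break
        record, pos = decoder.raw_decode(blob, pos)
        yield record["msg"]


# Three records copied from the log in this report.
sample = (
    '{"pid":7,"type":"jupyter","level":40,"msg":"Assertion at src/lib/engine/execution/layouts/greedy_assign_layouts.cpp:495","time":"2023-08-07T02:35:51.126Z","v":0} '
    '{"pid":7,"type":"jupyter","level":40,"msg":"Backtrace:","time":"2023-08-07T02:35:51.126Z","v":0} '
    '{"pid":7,"type":"jupyter","level":40,"msg":" 2# 0x00007DA04967948A in /usr/local/lib/python3.10/dist-packages/deepsparse/avx2/libonnxruntime.so.1.12.0","time":"2023-08-07T02:35:51.126Z","v":0}'
)

messages = list(iter_messages(sample))
for line in messages:
    print(line)
```

Running the same helper over the full log yields the engine banner, the system summary, the failed assertion in greedy_assign_layouts.cpp, and the backtrace frames in order.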