tensorflow / tflite-micro

Infrastructure to enable deployment of ML models to low-power resource-constrained embedded targets (including microcontrollers and digital signal processors).
Apache License 2.0

Handling Sparse Networks using TFLM #2279

Closed Heatdh closed 10 months ago

Heatdh commented 11 months ago

Dear TFLM Team/Community,

I would like to ask whether you have experimented with a way to handle sparse networks when generating the kernels for a specific target. Methods like unstructured pruning bring a decent speedup when running inference on consumer CPUs or on runtimes like DeepSparse that are optimized to handle the zeros and thus skip those operations entirely. Unfortunately, with structured pruning, where I remove entire kernels (which automatically yields a smaller model), the performance degrades badly, especially since the networks used for MCU deployment are not deep. Quantization as a method of compressing the network also harmed the performance drastically.

Thank you in advance,
Rayen
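For context on why unstructured sparsity helps on some runtimes but not in dense kernels: runtimes such as DeepSparse store only the nonzero weights and skip the corresponding multiply-accumulates, while a dense kernel executes every MAC regardless of the weight value. The sketch below illustrates the idea in plain NumPy; it is not TFLM or DeepSparse code, and all names are illustrative.

```python
import numpy as np

def dense_matvec(w, x):
    # Reference dense kernel: every multiply-accumulate is executed,
    # including the ones where the weight is zero.
    return w @ x

def sparse_matvec(w, x):
    # Hypothetical zero-skipping kernel: only nonzero weights contribute,
    # so the MAC count scales with the number of nonzeros, not w.size.
    out = np.zeros(w.shape[0], dtype=w.dtype)
    macs = 0
    for i in range(w.shape[0]):
        for j in np.nonzero(w[i])[0]:
            out[i] += w[i, j] * x[j]
            macs += 1
    return out, macs

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 8)).astype(np.float32)
# Unstructured magnitude pruning: zero out roughly the 75% smallest weights.
threshold = np.quantile(np.abs(w), 0.75)
w[np.abs(w) < threshold] = 0.0
x = rng.standard_normal(8).astype(np.float32)

y_sparse, macs = sparse_matvec(w, x)
# Same result as the dense kernel, with far fewer MACs executed.
assert np.allclose(dense_matvec(w, x), y_sparse, atol=1e-5)
print(f"MACs executed: {macs} of {w.size}")
```

With structured pruning the situation is different: removing whole filters yields a smaller dense model that any runtime speeds up automatically, but on the shallow networks typical of MCU deployment there is little redundancy to remove, which matches the accuracy drop described above.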

github-actions[bot] commented 10 months ago

"This issue is being marked as stale due to inactivity. Remove label or comment to prevent closure in 5 days."

github-actions[bot] commented 10 months ago

"This issue is being closed because it has been marked as stale for 5 days with no further activity."