neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/

Assertion `!cache_sizes.empty()' failed #1372

Closed akarym-sl closed 10 months ago

akarym-sl commented 10 months ago

Describe the bug
Running the benchmark on an AWS t2.micro instance, with or without Docker, yields the following error:

arch.bin: ./src/include/cpu_info/detect_arch_json.hpp:240: std::string cpu_info::to_summary_json(const cpu_info::detected_arch_t&): Assertion `!cache_sizes.empty()' failed.

Traceback (most recent call last):
  File "/usr/local/bin/deepsparse.benchmark", line 5, in <module>
    from deepsparse.benchmark.benchmark_model import main
  File "/usr/local/lib/python3.9/site-packages/deepsparse/__init__.py", line 33, in <module>
    from .engine import *
  File "/usr/local/lib/python3.9/site-packages/deepsparse/engine.py", line 28, in <module>
    from deepsparse.benchmark import BenchmarkResults
  File "/usr/local/lib/python3.9/site-packages/deepsparse/benchmark/__init__.py", line 19, in <module>
    from .ort_engine import *
  File "/usr/local/lib/python3.9/site-packages/deepsparse/benchmark/ort_engine.py", line 52, in <module>
    ARCH = cpu_architecture()
  File "/usr/local/lib/python3.9/site-packages/deepsparse/cpu.py", line 163, in cpu_architecture
    arch = _parse_arch_bin()
  File "/usr/local/lib/python3.9/site-packages/deepsparse/cpu.py", line 51, in __call__
    self.memo[args] = self.f(*args)
  File "/usr/local/lib/python3.9/site-packages/deepsparse/cpu.py", line 136, in _parse_arch_bin
    error = json.loads(ex.stdout)
  File "/usr/local/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/local/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
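The JSONDecodeError at the bottom of the traceback is secondary: arch.bin aborts on the assertion, so the output that DeepSparse then tries to parse as JSON is the assertion message rather than a JSON summary. A minimal sketch of that failure mode (the string below is copied from the assertion above; the parsing call mirrors the `json.loads(ex.stdout)` line in cpu.py, though the surrounding code here is my own illustration, not DeepSparse's):

```python
import json

# The text arch.bin printed instead of a JSON summary (from the report above).
assertion_output = (
    "arch.bin: ./src/include/cpu_info/detect_arch_json.hpp:240: "
    "std::string cpu_info::to_summary_json(const cpu_info::detected_arch_t&): "
    "Assertion `!cache_sizes.empty()' failed."
)

# json.loads() expects a JSON value at position 0, so it raises immediately,
# producing the "Expecting value: line 1 column 1 (char 0)" error seen above.
try:
    json.loads(assertion_output)
except json.JSONDecodeError as exc:
    print(f"JSONDecodeError: {exc}")
```

In other words, the root cause is the CPU-detection assertion, and the JSON error is just how it surfaces in Python.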

Expected behavior
Expected the benchmark results to be printed. I wanted to reproduce the 7-8x speedup stated on the Neural Magic website, but decided to first run my Docker container on the free t2.micro instance.

Environment

  1. OS: Amazon Linux
  2. Python version: 3.9.18
  3. DeepSparse version or commit hash: 1.5.3
  4. ML framework version(s): ?
  5. Other Python package versions: ONNX 1.12.0
  6. CPU info - output of deepsparse/src/deepsparse/arch.bin or output of cpu_architecture() as follows: N/A
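For context on the assertion itself: on Linux, per-core cache sizes are exposed under /sys/devices/system/cpu/*/cache, and on some virtualized instance types (older Xen-based ones such as t2.micro among them) those entries can be absent, which would plausibly leave arch.bin's cache_sizes list empty. A quick diagnostic sketch (my own helper, not part of DeepSparse, and the exact sysfs path arch.bin reads is an assumption):

```python
import glob


def detected_cache_sizes():
    """Return the cache-size strings sysfs exposes for cpu0.

    Hypothetical diagnostic: if this list is empty, a CPU-detection tool
    reading the same sysfs entries would have no cache sizes to report.
    """
    sizes = []
    for path in sorted(glob.glob("/sys/devices/system/cpu/cpu0/cache/index*/size")):
        with open(path) as f:
            sizes.append(f.read().strip())
    return sizes


if __name__ == "__main__":
    sizes = detected_cache_sizes()
    print(sizes or "no cache size entries exposed by this kernel/hypervisor")
```

Running this on the affected instance (versus a machine where the benchmark works) would help confirm whether missing sysfs cache entries are the trigger.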

To Reproduce

  1. Create a Docker image using this Dockerfile:

     FROM python:3.9.18-slim-bullseye
     USER root
     WORKDIR /app

     RUN pip install deepsparse[onnxruntime]

     COPY entrypoint.sh /app/
     RUN chmod 777 /app/entrypoint.sh

     ENTRYPOINT ["/app/entrypoint.sh"]
     CMD ["run"]

  2. Run it on an AWS EC2 t2.micro instance
tlrmchlsmth commented 10 months ago

Hi @akarym-sl, thank you for your bug report -- we have a fix for this and I'll let you know once it's available in deepsparse-nightly

tlrmchlsmth commented 10 months ago

Hi @akarym-sl, this should now be fixed in the latest nightly