neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/

docker access denied error #1452

Closed krrishdholakia closed 5 months ago

krrishdholakia commented 9 months ago

Describe the bug: I'm trying to pull the Docker image but I'm hitting an "access denied" error.

Expected behavior: I would expect docker pull ghcr.io/neuralmagic/deepsparse:1.5.1 to work.

Environment: I'm on a Mac.

Running: docker pull ghcr.io/neuralmagic/deepsparse:1.5.1

mgoin commented 9 months ago

Hi @krrishdholakia that docker pull command works on my M1 MacBook. Any further context you can provide about your environment?

Screenshot 2023-12-04 at 5 34 40 PM

I did uncover an issue with the latest nightly docker though, so we will be looking into that:

➜  ~ docker pull ghcr.io/neuralmagic/deepsparse-nightly
Using default tag: latest
latest: Pulling from neuralmagic/deepsparse-nightly
no matching manifest for linux/arm64/v8 in the manifest list entries

EDIT: It seems we do not have a Docker image available for ARM. Please try installing deepsparse-nightly from PyPI instead.
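
The manifest error above comes from Docker looking for an image that matches the host architecture. A minimal Python sketch of the resulting decision, assuming (as this thread states) that the published images are x86-only. The helper name `suggest_install_route` is made up for illustration and is not part of DeepSparse:

```python
import platform

# Machine strings that commonly identify 64-bit x86 hosts.
X86_64_NAMES = {"x86_64", "amd64"}


def suggest_install_route(machine: str) -> str:
    """Return "docker" for x86-64 hosts and "pip" for anything else.

    Hypothetical helper: it only encodes the advice given in this thread
    (Docker images exist for x86, ARM users should install from PyPI).
    """
    return "docker" if machine.lower() in X86_64_NAMES else "pip"


# platform.machine() reports e.g. "arm64" on Apple Silicon macOS.
print(suggest_install_route(platform.machine()))
```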

krrishdholakia commented 9 months ago

@mgoin I'm trying to get a local server running so I can add support for it via litellm. How do I do that with the pip package?

pip install deepsparse-nightly


  --task sentiment-analysis \
  --model_path zoo:nlp/sentiment_analysis/obert-base/pytorch/huggingface/sst2/pruned90_quant-none

Screenshot 2023-12-09 at 10 57 49 AM

krrishdholakia commented 9 months ago

Screenshot 2023-12-09 at 10 59 22 AM

mgoin commented 9 months ago

Hi @krrishdholakia, what version of Python do you have? We support Python 3.8-3.11 as of 1.6 stable or nightly.

If you want to use the server and transformers, you need to install those extras, as in pip install -U "deepsparse-nightly[server,transformers]" (quoted so that shells such as zsh do not treat the square brackets as a glob pattern).

Here is a colab notebook as an example: https://colab.research.google.com/drive/1Ng10jwBLUs81SDzZLE9P8G-q8D2YyKeL?usp=sharing
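
Before starting the server locally, it can help to confirm the two prerequisites mentioned in this comment. The following is a small sketch, not an official check: the version bounds are simply the 3.8-3.11 range quoted above, and the function names are hypothetical.

```python
import importlib.util
import sys

# Range quoted in this thread for DeepSparse 1.6 stable and nightly.
MIN_PY, MAX_PY = (3, 8), (3, 11)


def python_supported(version=None) -> bool:
    """True if the (major, minor) version falls inside the quoted range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PY <= major_minor <= MAX_PY


def deepsparse_installed() -> bool:
    """True if a top-level `deepsparse` package can be found, without importing it."""
    return importlib.util.find_spec("deepsparse") is not None


print("Python supported:", python_supported())
print("deepsparse importable:", deepsparse_installed())
```

This only shows that the base package is visible to the current interpreter; it does not verify that the server or transformers extras were installed.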

krrishdholakia commented 9 months ago

Thanks for the colab @mgoin, I have Python 3.11.

Screenshot 2023-12-09 at 12 11 18 PM

krrishdholakia commented 9 months ago

@mgoin any suggestions for how I can actually use and test the server from the colab? I was thinking about running it via ngrok, but I'm not sure how I could wrap it.

Screenshot 2023-12-09 at 12 13 11 PM

mgoin commented 8 months ago

Hey @krrishdholakia I wouldn't recommend trying to host a server from colab since it isn't a supported flow from Google.

You should be able to run that notebook locally on your MacBook, since the deepsparse release has native ARM macOS support. I only shared the colab as an example of an environment working without Docker.

Please let me know if you need a Docker image built for ARM; otherwise, please go the pip install -U "deepsparse-nightly[server,llm]" route in your local Python environment.

We have Docker images ready if you are on an x86 machine. Here is an example running with Docker on Windows:

Server:

docker run -p 5543:5543 -it ghcr.io/neuralmagic/deepsparse-nightly:20231220 deepsparse.server --task text-generation --integration openai --model_path hf:mgoin/llama2.c-stories15M-ds

Screenshot 2023-12-20 195557

Client:

curl http://localhost:5543/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer dummy" \
  -d '{
    "model": "hf:mgoin/llama2.c-stories15M-ds",
    "messages": "Once upon a time"
  }'
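
For an integration written in Python, the same request can be built with the standard library alone. This is a sketch that mirrors the curl command as written (same URL, headers, and body, including the plain-string `messages` value); the function names `build_chat_request` and `send` are illustrative and not part of DeepSparse or litellm.

```python
import json
import urllib.request

# Values copied from the curl example; adjust host/port to your server.
URL = "http://localhost:5543/v1/chat/completions"
MODEL = "hf:mgoin/llama2.c-stories15M-ds"


def build_chat_request(prompt: str, url: str = URL, model: str = MODEL) -> urllib.request.Request:
    """Build the same POST as the curl command, without sending it."""
    payload = {"model": model, "messages": prompt}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer dummy",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON body (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server from the docker run command listening on port 5543, `send(build_chat_request("Once upon a time"))` should return the decoded response. The response fields are not shown in this thread, so inspect the returned dictionary rather than assuming a schema.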

Screenshot 2023-12-20 195611

jeanniefinks commented 5 months ago

Hi @krrishdholakia As some time has passed with no further updates, I am going to go ahead and close out this issue. Please re-open if you want to continue the conversation. Best, Jeannie / Neural Magic