neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/

[server] Add `model` argument to server cli #1584

Closed · dsikka closed this 5 months ago

dsikka commented 5 months ago

Summary:

deepsparse.server \
 "zoo:llama2-7b-ultrachat200k_llama2_pretrain-pruned50_quantized" \
 --integration openai

PR Update

This PR enables the following CLI command:

deepsparse.server --integration openai "hf:mgoin/TinyStories-1M-ds"

With the current setup, we can run the following commands:

deepsparse.server --task text_generation --model_path "hf:mgoin/TinyStories-1M-ds"
deepsparse.server --model_path "hf:mgoin/TinyStories-1M-ds" --integration openai
deepsparse.server --task text_generation "hf:mgoin/TinyStories-1M-ds"
deepsparse.server --integration openai --task text_generation "hf:mgoin/TinyStories-1M-ds"
deepsparse.server --config_file ~/debugging/sample_config.yaml

Caveats (@bfineran):

Shoutout to @rahul-tuli for his click knowledge and help

mgoin commented 5 months ago

Is this a breaking change if we were using --model_path first? This seems fairly important for all server flows, and we should hopefully be able to deprecate rather than remove it.

dsikka commented 5 months ago

@mgoin Yes. If we make model_path a positional argument to match the UX docs, we can't also keep it as the --model_path option.

We could add a separate model-path entry point, but click can't support both, AFAIK.
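
A minimal sketch of the clash, assuming click (the `serve` name and defaults are made up for illustration, not deepsparse code): declaring both a positional argument and an option under the same name binds them to a single `model_path` parameter on the callback, so the two spellings can't be exposed side by side:

```python
import click

@click.command()
@click.argument("model_path", required=False)  # positional spelling
@click.option("--model_path", default=None)    # option spelling of the same name
def serve(model_path):
    # Both declarations bind to the one `model_path` callback parameter,
    # so whichever click processes last silently overwrites the other value.
    click.echo(f"model_path={model_path}")
```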

bfineran commented 5 months ago

> @mgoin Yes. If we make model_path a positional argument to match the UX docs, we can't also keep it as the --model_path option.
>
> We could add a separate model-path entry point, but click can't support both, AFAIK.

@dsikka Per @markurtz, let's add a model_path kwarg back in (we can rename the positional arg to something else) and allow it to override the positional arg if given. (We would need to make the positional arg optional in that case, I guess.)
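
A minimal sketch of that suggestion, assuming click (parameter names, defaults, and the deprecation warning are assumptions, not the actual deepsparse.server implementation): the positional arg becomes an optional `model`, and a deprecated `--model_path` option overrides it when given:

```python
import click

@click.command()
@click.argument("model", required=False)  # optional, so --model_path-only invocations still parse
@click.option("--model_path", default=None, help="Deprecated: pass the model positionally instead.")
@click.option("--task", default=None)
@click.option("--integration", default=None)
def serve(model, model_path, task, integration):
    # Per the suggestion above, the deprecated --model_path kwarg overrides
    # the positional argument when both are supplied.
    resolved = model_path if model_path is not None else model
    if resolved is None:
        raise click.UsageError("Provide a model positionally or via --model_path.")
    if model_path is not None:
        click.echo("warning: --model_path is deprecated, pass the model positionally", err=True)
    click.echo(f"serving {resolved} (task={task}, integration={integration})")

if __name__ == "__main__":
    serve()
```

With this shape, both the new `deepsparse.server "hf:mgoin/TinyStories-1M-ds"` and the legacy `deepsparse.server --model_path "hf:mgoin/TinyStories-1M-ds"` invocations parse, and the option wins when both are given.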