-
mentioned in #108. Currently we don't have an inference API like the `pipeline` from Hugging Face Transformers. Right now you need to manually load the model/tokenizer, apply them on the input data, a…
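For context, the convenience a `pipeline`-style API adds is bundling the load/encode/forward/decode steps that users currently wire up by hand behind one callable. A framework-agnostic sketch of that pattern (all class and method names here are illustrative, not an existing API):

```python
class TextPipeline:
    """Minimal pipeline pattern: wrap a tokenizer and a model behind one call."""

    def __init__(self, tokenizer, model):
        self.tokenizer = tokenizer
        self.model = model

    def __call__(self, text):
        # The three steps users otherwise perform manually:
        # encode the input, run the model, decode the output.
        ids = self.tokenizer.encode(text)
        output_ids = self.model.generate(ids)
        return self.tokenizer.decode(output_ids)
```

A real implementation would also handle batching and device placement, but the value is the same: one object owns the tokenizer/model pair and hides the glue code.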
-
### Motivation
Hello. I see from the docs that input logprobs are supported in offline inference mode. Is this also supported when deploying via the API server? If not, is there a plan to add it soon?
### Related resources
#2041
### Additional context
_No response_
-
In case it's helpful for others using these tests, I solved an error in the Serverless client with our [Inference](https://github.com/elastic/elasticsearch-clients-tests/blob/main/tests/inference/10_b…
-
Hi all!
Here are instructions for integrating the Groq API with Verba.
Obtain an API key at https://console.groq.com/login
1. `pip install groq`
2. Create "GroqGenerator.py" at goldenverba/compon…
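As a rough sketch of what such a generator ultimately sends: Groq exposes an OpenAI-compatible chat-completions endpoint, so the payload mirrors the OpenAI chat format. The helper below only assembles the request, it does not send it, and `build_groq_request` is an illustrative name, not part of Verba or the `groq` package:

```python
import json

# Groq's OpenAI-compatible chat completions endpoint.
GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_groq_request(api_key, model, messages):
    """Assemble the URL, headers, and JSON body for a Groq chat completion.

    messages follows the OpenAI chat format: a list of
    {"role": ..., "content": ...} dicts.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"model": model, "messages": messages})
    return GROQ_URL, headers, body
```

In practice the `groq` client from step 1 handles this for you; the point is that any generator class only needs the model name, the message list, and the API key.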
-
### Description
Large documents need to be chunked; otherwise, tokens beyond the model's limit won't be used.
MVP
- Use a sliding window approach
- Chunk into 200 words
- Try splitting on wh…
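The sliding-window MVP above could be sketched roughly as follows (a minimal sketch only: the 200-word chunk size comes from the list, the overlap value and the `chunk_words` name are illustrative, and whitespace splitting stands in for the smarter splitting the last bullet asks for):

```python
def chunk_words(text, size=200, overlap=50):
    """Split text into overlapping word windows (sliding-window chunking).

    size: words per chunk; overlap: words shared by consecutive chunks,
    so each window starts (size - overlap) words after the previous one.
    """
    words = text.split()  # naive whitespace split; stands in for smarter splitting
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the final window already covers the tail of the document
    return chunks
```

The overlap is what distinguishes a sliding window from plain fixed-size chunking: sentences that straddle a chunk boundary still appear intact in at least one window.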
-
### Description
The inference API doesn't expose some of the query parameters that the ML trained models API provides for managing asynchronous tasks. It would be helpful for users if the A…
-
### Discussed in https://github.com/danielmiessler/fabric/discussions/544
Originally posted by **xJohnWhite** June 4, 2024
# Summary
Using an inside-my-network OpenAI API-compatible inferenc…
-
Attempts to call `YggdrasilModel.predict` in a Node.js REPL yield this exception: `[ERR_INVALID_REPL_INPUT]: Listeners for uncaughtException cannot be used in the REPL`. We're using version `0.0.2` of…
-
File "/home/huyi/anaconda3/envs/tts/lib/python3.11/site-packages/gradio/queueing.py", line 532, in process_events
response = await route_utils.call_process_api(
^^^^^^^^^^^^^^^^…
-
Are there docs on best practices for using vLLM-hosted models?
I create a model using
`python -m vllm.entrypoints.openai.api_server --model model_path`
and try running it as
`lm_eval --model lo…`