-
Verifying the output of the glm4-9b-chat model with the following request makes the serving side report an error:
```shell
curl --request POST \
  --url http://127.0.0.1:8000/v1/chat/completions \
  --header 'content-type: application/json' \
  --data '{
    "model": "glm-4-9…
```
-
I only modified t6 instead of t4; t4 and t5 both work well for this model, but if we set thread=6 it always triggers the problem on my XIAOMI14Pro (SM8650, 8 Gen 3).
Please check it and resolve.
thanks~
…
-
### Describe the issue
Asking which version of pyautogen will support 'register_for_llm' in a later release, because I'm using the local model chatGLM, which needs openai float:
if base_currency == quote_currency:
…
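For reference, the tool-registration pattern the question refers to looks roughly like the following in pyautogen 0.2.x; the agent names, the local chatGLM endpoint, and the `exchange_rate` body are illustrative assumptions built around the fragment above, not the reporter's actual code.

```python
# Sketch of pyautogen's register_for_llm / register_for_execution pattern.
# The local endpoint and the rate logic below are placeholders.
from typing import Annotated
import autogen

llm_config = {
    "config_list": [
        {"model": "chatglm", "base_url": "http://localhost:8000/v1", "api_key": "none"}  # assumed local server
    ]
}

assistant = autogen.AssistantAgent("assistant", llm_config=llm_config)
user_proxy = autogen.UserProxyAgent("user_proxy", human_input_mode="NEVER")

@user_proxy.register_for_execution()                                   # executes the tool call
@assistant.register_for_llm(description="Currency exchange rate lookup.")  # advertises it to the LLM
def exchange_rate(
    base_currency: Annotated[str, "Base currency symbol"],
    quote_currency: Annotated[str, "Quote currency symbol"],
) -> float:
    if base_currency == quote_currency:
        return 1.0
    return 1.1  # hypothetical fixed rate, for illustration only
```

Note that `register_for_llm` relies on the backend supporting OpenAI-style tool calling, which is exactly what a local chatGLM endpoint may lack.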
-
**Title:** Automatically label medical data from diagnosis reports
**Project Lead:** Frank Langbein, frank@langbein.org
**Description:** We wish to automatically label medical diagnosis data (MRI,…
-
### Describe the bug
When providing an assistant ID for GPTAssistantAgent, the code pathway at line 117 always has a None value for the variables `instructions` and `specified_tools`. This is because the…
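A hedged sketch of the guard one would expect on that pathway, assuming the OpenAI Assistants API: when an `assistant_id` is supplied, retrieve the existing assistant and fall back to its stored configuration instead of the None overrides. The function name and fallback logic here are illustrative, not autogen's actual internals.

```python
# Illustrative guard: prefer the stored assistant config when an ID is given.
from openai import OpenAI

client = OpenAI()

def resolve_assistant(assistant_id=None, instructions=None, specified_tools=None):
    if assistant_id is not None:
        existing = client.beta.assistants.retrieve(assistant_id)
        # Fall back to the stored config when no override was passed in.
        instructions = instructions or existing.instructions
        specified_tools = specified_tools or existing.tools
        return existing, instructions, specified_tools
    created = client.beta.assistants.create(
        model="gpt-4o",  # assumed model for the sketch
        instructions=instructions,
        tools=specified_tools or [],
    )
    return created, instructions, specified_tools
```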
-
Can the ollama URL be configured to point to a remote box?
Or try using an ssh tunnel to make the remote ollama appear to be local (sketched below).
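A minimal sketch of the first option, assuming Ollama's OpenAI-compatible `/v1` endpoint on its default port 11434; the host `remote-box` and the model name are placeholders. If the port isn't reachable directly, an ssh tunnel such as `ssh -L 11434:localhost:11434 user@remote-box` makes the remote instance appear local at `http://127.0.0.1:11434`.

```python
# Point an OpenAI-compatible client at a remote Ollama instance.
from openai import OpenAI

client = OpenAI(
    base_url="http://remote-box:11434/v1",  # placeholder remote host
    api_key="ollama",  # any non-empty string; Ollama ignores it
)
reply = client.chat.completions.create(
    model="llama3",  # placeholder: a model already pulled on the remote box
    messages=[{"role": "user", "content": "ping"}],
)
print(reply.choices[0].message.content)
```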
-
### System Info
I am experimenting with TRT LLM and `flan-t5` models. My simple goal is to build engines with different configurations and tensor parallelism, then review performance. I have a DGX syst…
-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
I don't use huggingface because of a proxy issue, so I tried to make a local embeddin…
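A hedged sketch of one way to do this, assuming the post-0.10 LlamaIndex package layout (the `llama-index-embeddings-huggingface` extra) and an embedding model already downloaded to a local directory; the paths are placeholders. Passing a local path skips the hub download entirely, since sentence-transformers can load model files from disk.

```python
# Wire a locally stored embedding model into LlamaIndex without hub access.
from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# A local directory path avoids any network call to the HuggingFace hub.
Settings.embed_model = HuggingFaceEmbedding(model_name="./local-bge-model")  # placeholder path

docs = SimpleDirectoryReader("./data").load_data()  # placeholder data dir
index = VectorStoreIndex.from_documents(docs)
```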
-
### Your current environment
The output of `python collect_env.py`
```text
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N…
-
**Is your feature request related to a problem? Please describe.**
We are exploring the use of LaVague for web automation, but the limitation is that it relies on public-facing models. Can we supp…