-
### System Info
A100 40GB
RAM 32GB
### Who can help?
@ArthurZucker, @younesbelkada
### Information
- [x] The official example scripts
- [ ] My own modified scripts
### Tasks
- [ ]…
-
### What happened?
I can't make json_mode work with DeepInfra through LiteLLM, although it works fine when I call DeepInfra directly via the requests library.
Below is a small snippet to reproduce t…
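The original reproduction snippet is truncated above. As a minimal sketch of the direct-requests path that reportedly works, this builds a request body for an OpenAI-compatible chat endpoint with `response_format` set to JSON mode. The endpoint URL, model name, and prompt below are illustrative assumptions, not taken from the report:

```python
import json

# Illustrative endpoint for DeepInfra's OpenAI-compatible API (assumption).
API_URL = "https://api.deepinfra.com/v1/openai/chat/completions"

def build_json_mode_payload(model: str, prompt: str) -> dict:
    """Build a chat-completions request body that asks for JSON-only output."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # json_mode: the OpenAI-compatible way to request structured output
        "response_format": {"type": "json_object"},
    }

payload = build_json_mode_payload(
    "mistralai/Mistral-7B-Instruct-v0.3",  # hypothetical model id
    "Reply with a JSON object containing a single key 'answer'.",
)
body = json.dumps(payload)
# A real call would then be:
# requests.post(API_URL, headers={"Authorization": f"Bearer {TOKEN}"}, data=body)
```

When routing the same request through LiteLLM instead, the equivalent would be passing `response_format={"type": "json_object"}` to the completion call; comparing the wire payloads of the two paths is one way to narrow down where json_mode gets dropped.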
-
### Problem Description
Hi,
When doing text generation with Mistral 7B using Hugging Face transformers on an MI100 GPU, I can see in the collected torch trace that a lot of time is wasted due to a hipMem…
-
Hi All,
I am trying to run an inference server on Ollama using the command below:
`ollama run mistral:v0.3`
Then I launch h2oGPT with:
`python generate.py --guest_name='' --b…`
-
**Describe the bug**
I tried running DeepSpeed ZeRO-3 on a new Hugging Face model and got the following error:
[2023-12-13 04:12:18,837] [WARNING] [parameter_offload.py:86:_apply_to_tenso…
-
Dear authors of VideoLLaMA2,
Thanks for the great work. We tried to reproduce your results on vllava datasets using the latest version of the code. However, we observe a large discrepancy in the thre…
-
# Expected Behavior
I use llama-cpp-python on a non-GPU system and on an AMD GPU 6650 on Linux (Pop!_OS 22.04). This report is for the AMD GPU system. The non-GPU system outputs results fine. The AMD G…
-
**Problem**
Request from user:
https://build.nvidia.com/explore/discover
So this is something I came…
-
The model list I get is:
{
"models": [
{
"datasetName": null,
"datasetUrl": null,
"description": "Command R+ is Cohere's latest LLM and is the first op…
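Assuming the full response keeps the shape shown in the excerpt above (a top-level `"models"` array of objects carrying a `"description"` field), the descriptions can be pulled out with a short sketch; the sample response here is trimmed to the fields visible in the excerpt and the description is shortened, so it is illustrative only:

```python
import json

# Illustrative response, reduced to the fields shown in the excerpt above.
raw = """
{
  "models": [
    {
      "datasetName": null,
      "datasetUrl": null,
      "description": "Command R+ is Cohere's latest LLM"
    }
  ]
}
"""

data = json.loads(raw)
# Skip entries whose description is null, as other fields here are.
descriptions = [m["description"] for m in data["models"] if m.get("description")]
print(descriptions[0])
```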
-
Hi team,
Thank you for the great work. I tried to replicate the part that uses Mistral as the planner, and I noticed that in [tool_agent.py](https://github.com/OSU-NLP-Group/TravelPlanner/blob/main/agen…