-
I tried this using different llamafiles, but it always goes like this:
```
[alex@Arch bin]$ sh mixtral-8x7b-instruct-v0.1.Q5_K_M.llamafile -ngl 9999
import_cuda_impl: initializing gpu module...
ge…
```
-
Thanks for your great work! Do you have any plans to release the evaluation code for LLaVA-v1.5? Looking forward to your reply.
-
Hi!
**Doc request**
Fix the [LlavaForConditionalGeneration example](https://huggingface.co/docs/transformers/model_doc/llava#transformers.LlavaForConditionalGeneration.forward.example).
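For reference, a minimal sketch of the kind of runnable example that page could show (the checkpoint id, prompt format, and image URL here are assumptions, not taken from the docs):
```python
# Sketch of a LlavaForConditionalGeneration generate example; details are assumptions.
import requests, torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # assumed checkpoint
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

prompt = "USER: <image>\nWhat is shown in this image? ASSISTANT:"
image = Image.open(requests.get(
    "http://images.cocodataset.org/val2017/000000039769.jpg", stream=True
).raw)

inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(output[0], skip_special_tokens=True))
```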
**Addit…
-
I've been having a hellish experience trying to get the llama.cpp Python bindings to work with multiple GPUs. I have two RTX 2070s on Ubuntu, and I want to get llama.cpp performing inference using the …
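In case it helps others, here is a minimal sketch of splitting a model across two GPUs with llama-cpp-python (the model path is a placeholder, and the package must be built with CUDA support):
```python
# Sketch: llama-cpp-python with weights split across two GPUs.
# Assumes a CUDA-enabled build, e.g.
#   CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama(
    model_path="models/model.Q4_K_M.gguf",  # placeholder path to a local GGUF file
    n_gpu_layers=-1,          # offload all layers to the GPUs
    tensor_split=[0.5, 0.5],  # spread tensors evenly across GPU 0 and GPU 1
)
print(llm("Q: Name the planets in the solar system. A:", max_tokens=32)["choices"][0]["text"])
```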
-
Hi,
Have you evaluated your fine-tuned model on any QA datasets that would allow comparison with strong baseline models such as BLIP-2?
-
### 🐛 Describe the bug
When I try to train a model using torch.distributed.FullyShardedDataParallel, I found that when training with single-node multi-GPU (1 node × 8 A100s), the training speed is normal…
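For context, a minimal FSDP setup (launched with torchrun) looks roughly like this; the wrapped model below is a stand-in:
```python
# Minimal FSDP sketch; run with: torchrun --nproc_per_node=8 train.py
import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

dist.init_process_group("nccl")
torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))

model = torch.nn.Linear(1024, 1024).cuda()  # stand-in for the real model
model = FSDP(model)  # shards parameters, gradients, and optimizer state across ranks
```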
-
I'm trying to get the model to understand both pictures and videos, but there is an error. Obviously, what the video records is not the flag.
In my view, after the shared projection layer, in the shared feature s…
-
### OS
Microsoft Windows 10 Enterprise
Version 10.0.19045 Build 19045
from cmd shell:
```
wget https://huggingface.co/jartine/llava-v1.5-7B-GGUF/resolve/main/llava-v1.5-7b-q4-server.llamafile…
```
-
I'm trying to replace HuggingFaceEmbeddingModel with OnnxEmbeddingModel because I have an ONNX model and there is no reason to call HF and wait for the model to load.
The model loaded from [https://huggingface.c…
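As a fallback while sorting this out, a local ONNX embedding model can also be run directly with onnxruntime; a rough sketch, assuming a sentence-transformers-style export with mean pooling (file paths are placeholders):
```python
# Sketch: run a local ONNX embedding model with onnxruntime (paths are placeholders).
import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("tokenizer/")  # placeholder local tokenizer
session = ort.InferenceSession("model.onnx")             # placeholder local model

enc = tokenizer("hello world", return_tensors="np")
input_names = {i.name for i in session.get_inputs()}
outputs = session.run(None, {k: v for k, v in enc.items() if k in input_names})

token_embeddings = outputs[0]                            # (batch, seq_len, hidden)
mask = enc["attention_mask"][..., None].astype(np.float32)
embedding = (token_embeddings * mask).sum(axis=1) / mask.sum(axis=1)  # mean pooling
```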
-
Hi all, thanks for your work. I wonder how to fine-tune on a user's self-made dataset, as Qwen-VL does here: https://github.com/QwenLM/Qwen-VL/blob/master/README_CN.md#%E5%BE%AE%E8%B0%83
We have tested the pre…
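For anyone else looking, the linked Qwen-VL README describes a conversation-style JSON format; a hedged sketch of building one record of that shape (all paths and text below are placeholders):
```python
# Sketch of a Qwen-VL-style finetuning record; check the exact schema against
# the linked README. All values here are placeholders.
import json

sample = [{
    "id": "identity_0",
    "conversations": [
        {"from": "user", "value": "Picture 1: <img>data/images/example.jpg</img>\nWhat is in the picture?"},
        {"from": "assistant", "value": "A description of the picture."},
    ],
}]

with open("my_dataset.json", "w", encoding="utf-8") as f:
    json.dump(sample, f, ensure_ascii=False, indent=2)
```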