-
### System Info
Hi there, thank you for your great project!
I see that you have `--master-addr`/`--master-port` parameters when running the server.
Do you have any guide on using torch distributed in …
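For context, here is a minimal sketch of how `--master-addr`/`--master-port` typically feed into `torch.distributed`; how this particular server wires them up is an assumption, and the address/port values are placeholders:
```python
import os
import torch.distributed as dist

# Placeholder values; a launcher such as torchrun normally sets these from
# --master_addr / --master_port (or the MASTER_ADDR/MASTER_PORT env vars).
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")

# Every participating process joins the same group; world_size=1 here so the
# sketch runs standalone (use backend="nccl" and a real world size on GPUs).
dist.init_process_group(backend="gloo", rank=0, world_size=1)
print(f"rank {dist.get_rank()} of {dist.get_world_size()}")
dist.destroy_process_group()
```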
-
It's very frustrating that a lot of messages get written to stderr, like model parameters, which are very difficult to differentiate from errors. I tried to capture stderr, but then also error messages g…
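One workaround sketch, assuming the server is launched as a subprocess: keep stderr separate and filter it heuristically. The command line and the keyword filter below are my own assumptions, not anything the project provides:
```python
import subprocess

# Hypothetical command line; substitute the real server invocation.
proc = subprocess.Popen(
    ["python", "-m", "llama_cpp.server", "--model", "model.gguf"],
    stdout=subprocess.DEVNULL,  # avoid blocking on an unread stdout pipe
    stderr=subprocess.PIPE,
    text=True,
)

# Both the parameter dump and real errors arrive on stderr, so split them
# by keyword; crude, but it separates signal from noise.
for line in proc.stderr:
    level = "ERROR" if any(t in line.lower() for t in ("error", "failed")) else "info"
    print(f"[{level}] {line}", end="")
```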
-
Hello,
There are many instances where you would like to use a custom-trained LLM model.
GCP and Vertex AI allow a very easy and straightforward way of generating a tuned version of a published…
-
Benchmarking `deepseek-llm:67b-chat`, `mistral:latest`, `mixtral:latest`, & `llama2:13b` on [query classification](https://github.com/Shopify/reasonableai/commit/2a49d6fe4a240900396eb4b2b665d3454855d3…
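A minimal timing harness in this spirit, assuming the models are served by a local Ollama instance on the default port; the prompt and classification labels are placeholders, not taken from the linked commit:
```python
import time
import requests

MODELS = ["deepseek-llm:67b-chat", "mistral:latest", "mixtral:latest", "llama2:13b"]
PROMPT = "Classify the query 'red dress size 8' as product_search, support, or other."

for model in MODELS:
    start = time.monotonic()
    # Ollama's generate endpoint; stream=False returns a single JSON body.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": PROMPT, "stream": False},
        timeout=600,
    )
    print(f"{model}: {time.monotonic() - start:.1f}s -> {resp.json()['response'][:60]!r}")
```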
-
### Problem Description
When running llama.cpp's server example on ROCm with an RDNA3 GPU, GPU usage is shown as 100% and high power consumption is measured at the wall outlet, even with the serv…
-
Darwin Feedloops-Mac-Studio-2.local 23.3.0 Darwin Kernel Version 23.3.0: Wed Dec 20 21:31:00 PST 2023; root:xnu-10002.81.5~7/RELEASE_ARM64_T6020 arm64
command: python -m llama_cpp.server --model ./…
-
Hello,
I have a bug when using models from the transformers package that automatically add special tokens in the tokenizer, for instance:
```python
lm = models.Transformers("Open-Orca/Mistral…
```
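The underlying behavior can be reproduced with the tokenizer alone; a sketch assuming the checkpoint is Open-Orca/Mistral-7B-OpenOrca, which is only my guess at the truncated name above:
```python
from transformers import AutoTokenizer

# Assumed model id, completing the truncated name above.
tok = AutoTokenizer.from_pretrained("Open-Orca/Mistral-7B-OpenOrca")

# Many tokenizers silently prepend special tokens such as BOS, which can
# clash with libraries that assemble prompts token-by-token.
print(tok("Hello")["input_ids"])                            # includes a leading BOS id
print(tok("Hello", add_special_tokens=False)["input_ids"])  # raw tokens only
```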
-
The command `python3 torchchat.py where llama3` fails quietly, presumably because I might not have the HF token configured.
I assumed the code was broken, though, because I got a backtrace of the pr…
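A quick way to test that hypothesis before blaming the code; this sketch only assumes the standard huggingface_hub token lookup, nothing torchchat-specific:
```python
from huggingface_hub import get_token

# get_token() checks the HF_TOKEN env var and the cached CLI login.
if get_token() is None:
    raise SystemExit(
        "No Hugging Face token found; run `huggingface-cli login` "
        "(or export HF_TOKEN) and retry the torchchat command."
    )
print("Token present; gated-model downloads should be authorized.")
```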
-
First of all: your extension is awesome; thanks for all your effort in constantly making it better! 👍🏼
FIM doesn't work for [Mistral-7B-Instruct-v0.2-code-ft](https://huggingface.co/Nondzu/Mistral-…
-
Hi, thank you for the amazing model! Super excited to test it out!
I am trying to load it onto my GeForce RTX 3090 (24 GB VRAM), which I believe to be more than enough for inference with 8- or 4-bit…
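For reference, a hedged sketch of a 4-bit load that fits comfortably in 24 GB; the model id below is a placeholder, since the snippet doesn't name the exact checkpoint:
```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 4-bit quantization via bitsandbytes; a 7B-class model lands far below
# the 3090's 24 GB. The model id is a placeholder assumption.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    quantization_config=bnb,
    device_map="auto",  # place layers on available GPUs automatically
)
print(f"{model.get_memory_footprint() / 1e9:.1f} GB on device")
```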