-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubun…
-
### Your current environment
```text
The output of `python collect_env.py`
```
```
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTo…
-
I followed the documentation to build the LLaMA 3 8B Instruct model with multiple LoRA versions as described in this NVIDIA blog post (https://developer.nvidia.com/zh-cn/blog/deploy-multilingual-llms-w…
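For reference, if the same setup is served with vLLM, a multi-LoRA launch looks roughly like the sketch below. The adapter names and paths are hypothetical; `--enable-lora` and `--lora-modules name=path` are the relevant server flags.

```shell
# Hypothetical adapter paths; each --lora-modules entry registers a
# named LoRA adapter on top of the shared base model.
python -m vllm.entrypoints.openai.api_server \
  --model meta-llama/Meta-Llama-3-8B-Instruct \
  --enable-lora \
  --lora-modules sql-lora=./adapters/sql-lora fr-lora=./adapters/fr-lora
```

Requests then select an adapter by passing its registered name as the `model` field of the OpenAI-style request.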
-
### Your current environment
```text
The output of `python collect_env.py`
Collecting environment information...
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12…
-
### Your current environment
```text
# Using pip install vllm
vllm==v0.5.1
```
### 🐛 Describe the bug
```python
# My python script to test long text
def run_Mixtral():
    tokenizer = A…
-
### 🐛 Describe the bug
When the datapipe iterator is reset, the multiprocessing reading service tries to pickle the datapipe (why?). In case the data pipe contains a buffer with file handles this fai…
-
### Context
End Of Sequence tokens are an essential part of LLM training and inference. You can find more details in [this comment](https://discuss.huggingface.co/t/how-does-gpt-decide-to-stop-gene…
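To illustrate the role of EOS at inference time, here is a toy greedy decoding loop that stops as soon as the model emits the EOS id. All names and the id value are hypothetical; real decoders read `eos_token_id` from the tokenizer/model config.

```python
EOS_ID = 2  # hypothetical end-of-sequence token id

def toy_generate(next_token_fn, max_new_tokens: int = 16) -> list[int]:
    """Append tokens until EOS appears or the token budget runs out."""
    out: list[int] = []
    for _ in range(max_new_tokens):
        tok = next_token_fn(out)
        if tok == EOS_ID:  # EOS ends generation; without it we run to the budget
            break
        out.append(tok)
    return out

# A fake "model" that emits 5, 6, then EOS.
script = iter([5, 6, EOS_ID, 7])
print(toy_generate(lambda ctx: next(script)))  # [5, 6]
```

If the model never learns to emit EOS (or EOS is masked out), the loop above degrades to always exhausting `max_new_tokens`, which is the usual symptom of a mis-set EOS id.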
-
### Your current environment
```text
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: RED OS release MUROM (7.3.4) Stan…
-
Please add support for this model. https://github.com/vikhyat/moondream
An extra idea, which may or may not be feasible (I do not know), is speculative decoding using a smaller model like th…
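For context on the speculative-decoding idea, a toy greedy draft-and-verify step looks like the sketch below. Function names are hypothetical, and real implementations accept/reject drafts probabilistically rather than by exact match:

```python
def speculative_decode_step(draft_next, target_next, ctx, k=4):
    """Draft k tokens with the cheap model, keep the target-verified prefix."""
    # Draft model proposes k tokens autoregressively.
    proposed = []
    for _ in range(k):
        proposed.append(draft_next(ctx + proposed))
    # Target model verifies: accept the longest agreeing prefix, then
    # emit one corrected token at the first mismatch.
    accepted = []
    for tok in proposed:
        expected = target_next(ctx + accepted)
        if tok == expected:
            accepted.append(tok)
        else:
            accepted.append(expected)
            break
    return accepted

# Toy models: the "target" continues 1, 2, 3, ...; the "draft" agrees
# for the first two tokens and then guesses wrong.
target_next = lambda ctx: len(ctx) + 1
draft_next = lambda ctx: len(ctx) + 1 if len(ctx) < 2 else 99
result = speculative_decode_step(draft_next, target_next, [], k=4)
print(result)  # [1, 2, 3]
```

The payoff is that one verification pass over the target model can commit several tokens at once, which is why a small companion model could speed up a larger one.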
-
### My environment setup
First environment (running on an EC2 `g6.4xlarge`)
```
[2024-06-01T10:14:23Z] Collecting environment information...
[2024-06-01T10:14:26Z] PyTorch version: 2.3.0+cu121
[2024-0…