-
### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
- [X] 3. Please note that if the bug-related issue y…
-
### What happened + What you expected to happen
I am trying to embed 5 GB of PDFs and other files, then send them into a vector database (Milvus). It slowly builds up to using 1 TB of object_store_me…
-
### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
- [X] 3. Please note that if the bug-related issue y…
-
ValueError: Unknown quantization method: flute. Must be one of ['aqlm', 'awq', 'deepspeedfp', 'tpu_int8', 'fp8', 'fbgemm_fp8', 'modelopt', 'marlin', 'gguf', 'gptq_marlin_24', 'gptq_marlin', 'awq_marli…
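For context, an error of this shape usually comes from a membership check against a fixed list of supported method names. The following is a minimal sketch, not the library's actual code: `SUPPORTED_METHODS` and `check_quantization` are hypothetical names, and the list holds only a few of the entries visible in the message above.

```python
# Hypothetical registry: a subset of the names shown in the error message above.
SUPPORTED_METHODS = ["aqlm", "awq", "fp8", "marlin", "gguf"]


def check_quantization(method: str) -> str:
    """Return the method name if it is supported, otherwise raise ValueError."""
    if method not in SUPPORTED_METHODS:
        raise ValueError(
            f"Unknown quantization method: {method}. "
            f"Must be one of {SUPPORTED_METHODS}."
        )
    return method


try:
    check_quantization("flute")
except ValueError as err:
    # Keep the message so it can be inspected or logged.
    message = str(err)
    print(message)
```

Under this assumption, a method name missing from the registry fails before any model loading starts, which matches the point at which the error is reported.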
-
e.g. using a Triton kernel with FakeTensor results in a CUDA illegal memory access: https://gist.github.com/zou3519/5a8d9872a9855ba9efeac82d516843ba
We should offer a better error message here.
cc…
-
### Describe the issue
I tried to fine-tune LLaVA-v1.6-34B with LoRA on my own dataset (size 1000), and I constantly hit a tokenization mismatch during the process.
This is my script for LLaVA-v1.6-34B…
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTor…
-
Here are the high-level tasks for enabling WMMA for GEMM:
- [x] Implement BlockedToWMMA in TritonAMDGPUAccelerateMatmulPass according to required layout (https://github.com/ROCmSoftwarePlatform/triton/pu…
-
## Description of bug / unexpected behavior
I took the following example from the VoiceOver Website:
```
class MyScene(VoiceoverScene):
    def construct(self):
        self.set_speech_…
-
`swift eval --eval_url xxx/v1/chat/completions --eval_dataset no --eval_is_chat_model true --model_type Qwen2-72B-Instruct-AWQ --custom_eval_config custom_config.json`
File "/data/…