-
### System Info
```Shell
- `Accelerate` version: 0.26.1
- Platform: Linux-6.1.58+-x86_64-with-glibc2.35
- Python version: 3.10.12
- Numpy version: 1.26.3
- PyTorch version (GPU?): 2.1.2+cu121 (Tr…
-
In this chapter, we look at how to crawl Korean stock-market data: stock tickers, the constituent stocks of each sector, and the core data needed for quant investing, namely adjusted stock prices, financial statements, and valuation metrics.
-
### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
- [X] 3. Please note that if the bug-related issue y…
-
It seems to me that, for now, mlc tries to load all of the weights onto a single GPU card.
After convert_weight/gen_config/compile, it reports an error when it is about to serve:
```
AssertionError: Cannot estimat…
-
### Anything you want to discuss about vllm.
### Your current environment
The output of `python collect_env.py`
```text
PyTorch version: 2.4.0+cu121
Is debug…
-
**Describe the bug**
I just full fine-tuned the `florence-2-large-ft` model and now I can't run it.
# Command to reproduce
```txt
CUDA_VISIBLE_DEVICES=0 swift infer --ckpt_dir output/flore…
-
When running the Qwen2-57B-A14B-Instruct-GPTQ-Int4 model with vllm 0.4.3, it errors out immediately. I'm not sure whether this is a vllm problem or a Qwen2 problem. Could it be that quantized MoE models are not supported?
Command: python -m vllm.entrypoints.openai.api_server --model /data/models/Qwen2-57B-A14B-Instruct-GPTQ-Int4 --m…
-
Has anyone been able to get the LLaMA-2 70B model to run inference in 4-bit quantization using HuggingFace? Here are some variations of code that I've tried based on various guides:
```python3
nam…
-
I wonder how that could be possible.
I'd prefer to have an alpha channel rather than having to key the exported result.
I'm sure there's some ffmpeg set of options for this :)
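For reference, a hedged sketch of two ffmpeg invocations that preserve an alpha channel on export; the input/output file names are placeholders, and the right choice depends on the source material:

```shell
# ProRes 4444 keeps alpha (large files, well supported in NLEs):
ffmpeg -i input_with_alpha.mov -c:v prores_ks -profile:v 4444 \
    -pix_fmt yuva444p10le out_prores.mov

# VP9 in WebM also carries alpha (smaller, web-friendly):
ffmpeg -i input_with_alpha.mov -c:v libvpx-vp9 -pix_fmt yuva420p out_alpha.webm
```

The key is pairing an alpha-capable codec with a `yuva*` pixel format; most common delivery codecs (H.264/H.265) drop alpha entirely.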
-
I tried the minimal example from https://huggingface.co/Snowflake/snowflake-arctic-instruct and it did not work. Can you help me fix it?
I'm using the latest transformers release [commit](htt…