-
I merged a Mixtral 8x7B model with the LoRA adapter, and I saved the .pt with torch.save(model.state_dict(), 'path_to_model.pt').
However, when I use vLLM to run inference on the newly merged model, I fai…
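For reference, vLLM loads Hugging Face-format model directories (config plus weight shards), not a bare state_dict saved with torch.save. Below is a minimal sketch of the usual merge-and-export flow, assuming the adapter was trained with PEFT; the adapter path and base-model name are illustrative, not from the original report.

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Load base model + adapter in one step (hypothetical adapter directory)
model = AutoPeftModelForCausalLM.from_pretrained("path_to_adapter", torch_dtype="auto")

# Fold the LoRA deltas into the base weights, then export in HF format
merged = model.merge_and_unload()
merged.save_pretrained("merged-mixtral-8x7b")

# Save the tokenizer alongside so the directory is loadable standalone
tok = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x7B-v0.1")
tok.save_pretrained("merged-mixtral-8x7b")
```

The resulting directory can then be passed to vLLM as the model path.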
-
**Describe the bug**
This is extremely painful; models with dynamic shapes simply cannot be converted.
**To Reproduce**
```python
import nncase
import numpy as np
import onnx
import onnxsim
# from nncase_base_func import model_simplify, read_model_fil…
```
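One workaround sketch, assuming a recent onnxsim (0.4+) and that the failing model has a dynamic batch dimension: pin the inputs to static shapes before handing the model to nncase. The input name and shape below are illustrative; substitute your model's actual ones.

```python
import onnx
import onnxsim

model = onnx.load("model.onnx")

# Overwrite the dynamic input dims with a fixed shape, then simplify
model_static, ok = onnxsim.simplify(
    model,
    overwrite_input_shapes={"input": [1, 3, 224, 224]},
)
assert ok, "onnxsim could not verify the simplified model"
onnx.save(model_static, "model_static.onnx")
```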
-
Running ``quantize.py`` with ``--mode int4-gptq`` does not seem to work:
- the code tries to import ``lm-evaluation-harness``, which is not included, documented, or used elsewhere (see the optional-import sketch after this list)
- the import in ``eval.py`` is incorrect…
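One common way to make such a dependency optional (a sketch, not the repo's actual code): guard the import and fail only when the eval path is actually used.

```python
# Import the harness lazily; `lm_eval` is the module installed by
# the lm-evaluation-harness package (pip install lm-eval)
try:
    import lm_eval
except ImportError:
    lm_eval = None

def run_harness_eval(*args, **kwargs):
    # Hypothetical wrapper: only the eval path requires the harness
    if lm_eval is None:
        raise RuntimeError(
            "lm-evaluation-harness is required for this eval mode; "
            "install it with `pip install lm-eval`."
        )
    # ... actual evaluation would go here ...
```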
-
The problem is similar to https://github.com/InternLM/lmdeploy/issues/1991#issue-2402071158: asking relatively short questions produces normal output, but asking longer questions (over 10,000 characters, yet within the length configured via session-len) returns an empty result.
Model: qwen1.5-7b-chat
Launch script: lmdeploy serve api_server qwen1half-7b…
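A repro sketch against the server's OpenAI-compatible endpoint; the port (lmdeploy's usual default, 23333) and the served-model name are assumptions.

```python
from openai import OpenAI

# Point the OpenAI client at the local lmdeploy api_server (port assumed)
client = OpenAI(base_url="http://localhost:23333/v1", api_key="none")

# Build a prompt longer than 10,000 characters but shorter than session-len
long_prompt = "请总结以下内容:" + "测试文本。" * 3000

resp = client.chat.completions.create(
    model="qwen1.5-7b-chat",  # assumed served-model name
    messages=[{"role": "user", "content": long_prompt}],
    max_tokens=256,
)
# Inspect whether the content comes back as an empty string
print(repr(resp.choices[0].message.content))
```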
-
Hello,
I am new to RAMPAGE-seq analysis and am having a few problems. My BED files were generated using rampage_peaks.sh. However, when I run IDR on those files I get the following message. I t…
-
I want to use vLLM with the model amazon/FalconLite2 (https://huggingface.co/amazon/FalconLite2) to benchmark throughput and latency. However, the model is not supported by vLLM.…
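For context, a basic load attempt looks like the sketch below; since FalconLite2 ships custom modeling code, trust_remote_code is needed, and whether vLLM recognizes the architecture is exactly what this issue is about.

```python
from vllm import LLM

# Attempt to load the model; presumably this fails with an
# unsupported-architecture error rather than running
llm = LLM(model="amazon/FalconLite2", trust_remote_code=True)
print(llm.generate(["Hello"])[0].outputs[0].text)
```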
-
### Feature request
Hyperparameter search is an essential step for finding the hyperparameters that optimize a machine learning or deep learning model's output. I was trying hyperparameter_search…
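For reference, here is a self-contained sketch of Trainer.hyperparameter_search with the Optuna backend; the tiny model and dataset are illustrative stand-ins, and optuna must be installed.

```python
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
data = Dataset.from_dict({"text": ["good", "bad"] * 8, "label": [1, 0] * 8})
data = data.map(
    lambda x: tokenizer(x["text"], truncation=True, padding="max_length", max_length=16)
)

def model_init():
    # hyperparameter_search re-instantiates the model each trial via model_init
    return AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased", num_labels=2
    )

def hp_space(trial):
    # Optuna trial object: define the search space per hyperparameter
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-4, log=True),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 1, 3),
    }

trainer = Trainer(
    model_init=model_init,
    args=TrainingArguments(output_dir="hp_search", per_device_train_batch_size=4),
    train_dataset=data,
    eval_dataset=data,
)

best = trainer.hyperparameter_search(
    direction="minimize", backend="optuna", hp_space=hp_space, n_trials=4
)
print(best.hyperparameters)
```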
-
I am using the latest vllm docker image, trying to run the Mixtral 8x7B model quantized in AWQ format. I got the error message below:
```
INFO 12-24 09:22:55 llm_engine.py:73] Initializing an LLM engine …
```
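For anyone reproducing, a minimal AWQ load through vLLM's Python API looks like the sketch below; the checkpoint name is illustrative, and vLLM selects its AWQ kernels via the quantization argument.

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ",  # illustrative AWQ checkpoint
    quantization="awq",   # tell vLLM the weights are AWQ-quantized
    dtype="float16",      # AWQ kernels run on fp16 activations
)
out = llm.generate(["Hello"], SamplingParams(max_tokens=16))
print(out[0].outputs[0].text)
```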
-
On commit 7e259b8 from PR #817, the test case `prusti-tests/tests/verify/pass/arrays/selection_sort.rs` reliably fails with a timeout during the CI run on Ubuntu. This timeout issue is not…
-
Hi,
we are facing the following problem with kallisto 0.44 for paired-end data, though it does not seem to be specific to paired-end mode (see below):
- [quant] processed 56,725,603 reads, 20,911,8…