-
Does the codebase support 8-bit training, similar to the peft library?
I was trying to fine-tune llama2-7b on 24 GB 4090 cards. Below is the error I got:
File "/home/nlp/JORA/examples/train.py", li…
yctam updated
3 months ago
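The 8-bit question above comes down to storing weights as int8 plus a scale factor. Below is a minimal, GPU-free sketch of absmax int8 quantization, the core idea behind bitsandbytes-style 8-bit loading; it is an illustration of the technique, not the library's actual implementation:

```python
# Absmax int8 quantization: map floats to [-127, 127] with one scale
# per tensor, as done (per-tensor/per-block) in 8-bit inference schemes.

def quantize_int8(weights):
    """Quantize a list of floats to int8 values plus a scale."""
    absmax = max(abs(w) for w in weights)
    scale = absmax / 127.0 if absmax else 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate floats from int8 values and the scale."""
    return [x * scale for x in q]

w = [0.4, -1.2, 0.05, 0.9]
q, s = quantize_int8(w)
w_hat = dequantize_int8(q, s)
```

Per element, the reconstruction error is bounded by half the scale, which is why 8-bit weights lose little accuracy while roughly halving memory versus fp16.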
-
-
I want to reproduce the llama2 steps from scripts/llama2_example.sh on an RTX 4090.
I just ran the command
`python -m awq.entry --model_path /data/models/Llama-2-7b-chat-hf --w_bit 4 --q_group_s…
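For context on the flags in that command: `--w_bit 4` sets the weight bit-width, and the truncated `--q_group_s…` flag sets the quantization group size. The sketch below shows plain group-wise 4-bit absmax quantization, which is what those two parameters control; note that AWQ itself additionally rescales salient channels before quantizing, which this sketch does not do:

```python
# Group-wise quantization: each group of weights gets its own scale,
# so the bit budget (here 4 bits, signed range [-7, 7]) adapts locally.

def quantize_groupwise(weights, w_bit=4, group_size=4):
    """Quantize weights in groups; returns (int values, scale) per group."""
    qmax = 2 ** (w_bit - 1) - 1          # 7 for signed 4-bit
    groups = []
    for i in range(0, len(weights), group_size):
        g = weights[i:i + group_size]
        absmax = max(abs(w) for w in g)
        scale = absmax / qmax if absmax else 1.0
        groups.append(([round(w / scale) for w in g], scale))
    return groups

def dequantize_groupwise(groups):
    """Flatten groups back to approximate float weights."""
    out = []
    for q, scale in groups:
        out.extend(x * scale for x in q)
    return out
```

Smaller group sizes track outliers better at the cost of storing more scales, which is the trade-off the group-size flag exposes.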
-
Sample:
https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/GPU/QLoRA-FineTuning/alpaca-qlora/finetune_llama2_7b_arc_2_card.sh
Env:
Intel(R) Xeon(R) w7-3455
2 ARC770
ubuntu22.…
-
### Feature request
We would like to request access to Llama-2 and implement it as a provided ML model, i.e. ```openadapt.provider.LLAMA2``` or similar.
### Motivation
_No response_
-
Currently, no performance optimizations have been included in the code. I started benchmarking the performance against HF transformers (which I think is the fair comparison for this project, and vs lla…
rbitr updated
8 months ago
-
Thank you very much for this excellent open-source model.
What configuration is needed for a local, private deployment, and what should we watch out for?
https://huggingface.co/OpenBuddy/openbuddy-llama2-70b-v10.1-bf16
-
Hi.
I have been doing some benchmarks on an NVIDIA V100 32GB GPU.
First, I benchmarked Llama2-7B-chat using Hugging Face transformers and CTranslate2. I saw reduced latency when using ct2 ( 12 secon…
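For latency comparisons like the one above, a simple wall-clock harness is usually enough. In this sketch, `benchmark_latency` and its `generate_fn` argument are placeholders for whichever backend's generate call is being timed; this is not the API of either library:

```python
# Wall-clock latency harness: warm up, then time repeated calls.
import time
import statistics

def benchmark_latency(generate_fn, n_runs=5, warmup=1):
    """Return (mean, stdev) of wall-clock latency in seconds."""
    for _ in range(warmup):              # warm caches/allocators before timing
        generate_fn()
    times = []
    for _ in range(n_runs):
        start = time.perf_counter()
        generate_fn()
        times.append(time.perf_counter() - start)
    return statistics.mean(times), statistics.stdev(times)

# Stand-in workload; replace with the model's generate call.
mean_s, stdev_s = benchmark_latency(lambda: sum(range(100_000)))
```

The warmup run matters on GPU backends, where the first call often pays one-time kernel-compilation and allocation costs that would skew the mean.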
-
# LightLLM run notes
Reproducing the kvoff branch
##### Step 1: create the Docker container
Pull the image: `docker pull ghcr.io/modeltc/lightllm:main`
The llama-7b model is too large, and cloning it directly inside the server's Docker container kept failing with network interruptions, so I downloaded the model locally, transferred it to the server with Xftp, and then, when creating the container, mapped the model folder into lightl…
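The mapping step described above can be sketched as a `docker run` with a bind mount; the host path and GPU flags here are assumptions and need adjusting to the actual server layout:

```shell
# Hypothetical host path where the model was uploaded via Xftp.
MODEL_DIR=/data/models/llama-7b

# Bind-mount the host folder into the container instead of cloning the
# weights inside it, avoiding the network interruptions described above.
docker run -it --gpus all \
  -v "$MODEL_DIR":/models/llama-7b \
  ghcr.io/modeltc/lightllm:main /bin/bash
```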
-
Hi @RaymondWang0 ,
I'm trying to implement this solution on Windows (CPU), and the prerequisites have been met.
Model used is LLaMA2_7B_chat_awq_int4 --QM QM_x86
I'm getting error with 'make cha…