Closed — khalil-Hennara closed this issue 9 months ago
Hello, Thank you for reaching out about AceGPT. We haven't specifically tested it on Google Colab, but we're here to help. Could you please share the error messages you're facing with AceGPT on Colab?
Regarding the model precision, we're transitioning from the current fp32 version to an fp16 version to facilitate easier usage and download.
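To see why the fp16 release makes the download easier, here is some rough back-of-the-envelope arithmetic (illustrative estimates only, not measured file sizes; the 13B parameter count is taken from the model name):

```python
# Approximate checkpoint sizes for a 13B-parameter model at different precisions.
params = 13e9

bytes_fp32 = params * 4   # 4 bytes per float32 weight
bytes_fp16 = params * 2   # 2 bytes per float16 weight

gb = 1024 ** 3
print(f"fp32: ~{bytes_fp32 / gb:.0f} GiB, fp16: ~{bytes_fp16 / gb:.0f} GiB")
```

Halving the bytes per weight halves both the download and the memory footprint, which is why fp16 checkpoints are the usual distribution format.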
Best regards,
Thanks for your response. I didn't get any error because the model didn't even load. I can share the code I used to load the model in 4-bit; it is the same code I used to load Llama 2 7B and 13B:
```python
from torch import cuda, bfloat16
import transformers

model_id = 'FreedomIntelligence/AceGPT-13B-chat'
device = f'cuda:{cuda.current_device()}' if cuda.is_available() else 'cpu'

# bitsandbytes quantization config
bnb_config = transformers.BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type='nf4',
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=bfloat16
)

hf_auth = 'token from hugging face'
model_config = transformers.AutoConfig.from_pretrained(
    model_id,
    use_auth_token=hf_auth
)

model = transformers.AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    config=model_config,
    quantization_config=bnb_config,
    device_map='auto',
    use_auth_token=hf_auth
)
model.eval()
print(f"Model loaded on {device}")
```
The same code works for Llama, and the token is the same one I use for Llama. Thanks in advance.
Please try our int4 model, quantized with AutoGPTQ. For some reason, probably a package-version issue, this model can't be loaded directly from the Hugging Face Hub, but it still works with git clone. In Colab, you can try the following:
```shell
!pip install transformers==4.32.0
!pip install sentencepiece
!pip3 install auto-gptq==0.4.2
!git clone https://huggingface.co/FreedomIntelligence/AceGPT-7b-chat-GPTQ
```
```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_id = 'AceGPT-7b-chat-GPTQ'
model = AutoGPTQForCausalLM.from_quantized(model_id, use_safetensors=False)
tokenizer = AutoTokenizer.from_pretrained(model_id, padding_side="right", use_fast=False)

# Llama-2-style chat template with an Arabic system prompt
# (roughly: "You are a helpful, respectful and honest assistant.
# Always answer as helpfully as possible while being safe...").
prompt_dict = {
    'AceGPT': """[INST] <<SYS>>\nأنت مساعد مفيد ومحترم وصادق. أجب دائما بأكبر قدر ممكن من المساعدة بينما تكون آمنا. يجب ألا تتضمن إجاباتك أي محتوى ضار أو غير أخلاقي أو عنصري أو جنسي أو سام أو خطير أو غير قانوني. يرجى التأكد من أن ردودك غير متحيزة اجتماعيا وإيجابية بطبيعتها.\n\nإذا كان السؤال لا معنى له أو لم يكن متماسكا من الناحية الواقعية، اشرح السبب بدلا من الإجابة على شيء غير صحيح. إذا كنت لا تعرف إجابة سؤال ما، فيرجى عدم مشاركة معلومات خاطئة.\n<</SYS>>\n\n""",
}
role_dict = {
    'AceGPT': ['[INST]', '[/INST]'],
}

def format_message(query, max_src_len):
    # max_src_len is the token budget for the prompt; no truncation is
    # applied here since this is single-turn usage.
    return f"""{prompt_dict["AceGPT"]}{query} {role_dict["AceGPT"][1]}"""

temperature = 0.5
max_new_tokens = 768
content_len = 2048

message = 'أين هي عاصمة المملكة العربية السعودية'  # "Where is the capital of Saudi Arabia?"
history = []
max_src_len = content_len - max_new_tokens - 8
prompt = format_message(message, max_src_len)

model_inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
output = model.generate(**model_inputs, max_new_tokens=max_new_tokens,
                        do_sample=True, temperature=temperature)
print(output.shape)
print(tokenizer.decode(output[0]))
```
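Note that `tokenizer.decode(output[0])` returns the prompt followed by the answer, since `generate` echoes the input ids. A common trick (a sketch, not something from the AceGPT repo) is to slice off the prompt tokens before decoding; illustrated here with plain lists standing in for the token-id tensors:

```python
# Toy illustration with plain lists standing in for token-id tensors.
# In the real script you would use prompt_len = model_inputs["input_ids"].shape[1]
# and generated = output[0][prompt_len:].
prompt_ids = [1, 15, 42, 7]           # hypothetical prompt token ids
output_ids = prompt_ids + [99, 3, 2]  # generate() echoes the prompt first

prompt_len = len(prompt_ids)
generated = output_ids[prompt_len:]   # keep only the newly generated tokens
print(generated)  # → [99, 3, 2]
```

Decoding only `generated` with `skip_special_tokens=True` then yields just the model's answer.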
Thanks a lot, it works very well. Thanks for your time and support.
I want to ask how to use AceGPT on Colab. I've tried many approaches but nothing works. I've run Llama 2 on Colab, both 7B and 13B, using 4-bit quantization, but the same approach didn't work with AceGPT. I don't know what the problem is, since AceGPT is a fine-tuned version of Llama 2. Could you please provide a notebook or some code to run the model on Colab? My last question: why has the model been saved in float32?
Thanks in advance
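On the float32 question: the trade-off when moving to fp16 is storage versus precision, which can be seen without any model at all using Python's `struct` module (a standalone illustration, not AceGPT-specific):

```python
import struct

x = 3.14159
print(len(struct.pack('d', x)))  # 8 bytes: float64
print(len(struct.pack('f', x)))  # 4 bytes: float32
print(len(struct.pack('e', x)))  # 2 bytes: float16

# Round-tripping through fp16 loses some precision:
x16 = struct.unpack('e', struct.pack('e', x))[0]
print(x16)  # close to 3.14159, but not exactly equal
```

For inference, this small precision loss is generally negligible, which is why distributing fp16 (or int4, as in the GPTQ model above) checkpoints is standard practice.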