Closed srimouli04 closed 1 year ago
model_type should not appear in the config.json file created by CTranslate2. You don't need to add this parameter.
Can you post the exact steps to reproduce the error? In particular, what conversion command did you use?
Hey @guillaumekln
This is the command I used. I downloaded the model to my local machine and then used it for the conversion:
ct2-transformers-converter --model <model_path>/falcon-40b-instruct --quantization float16 --output_dir falcon-40b-instruct --trust_remote_code
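For reference, the same conversion can also be run from Python through CTranslate2's converter API; a minimal sketch assuming the same local path:
import ctranslate2

# Rough Python-API equivalent of the ct2-transformers-converter command above.
converter = ctranslate2.converters.TransformersConverter(
    "<model_path>/falcon-40b-instruct",  # local copy of the original HF model
    trust_remote_code=True,              # Falcon ships custom modelling code
)
converter.convert("falcon-40b-instruct", quantization="float16")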
But without model_type, transformers.AutoTokenizer.from_pretrained raises a ValueError, which reads like:
raise ValueError(
ValueError: Unrecognized model in <model_path>/falcon-40b-instruct. Should have a `model_type` key in its config.json, or contain one of the following strings in its name: albert, align, altclip, audio-spectrogram-transformer, autoformer, bart, beit, bert, bert-generation, big_bird, bigbird_pegasus, biogpt, bit, blenderbot, blenderbot-small, blip, blip-2, bloom, bridgetower, camembert, canine, chinese_clip, clap, clip, clipseg, codegen, conditional_detr, convbert, convnext, convnextv2, cpmant, ctrl, cvt, data2vec-audio, data2vec-text, data2vec-vision, deberta, deberta-v2, decision_transformer, deformable_detr, deit, deta, detr, dinat, distilbert, donut-swin, dpr, dpt, efficientformer, efficientnet, electra, encoder-decoder, ernie, ernie_m, esm, flaubert, flava, fnet, focalnet, fsmt, funnel, git, glpn, gpt-sw3, gpt2, gpt_bigcode, gpt_neo, gpt_neox, gpt_neox_japanese, gptj, gptsan-japanese, graphormer, groupvit, hubert, ibert, imagegpt, informer, jukebox, layoutlm, layoutlmv2, layoutlmv3, led, levit, lilt, llama, longformer, longt5, luke, lxmert, m2m_100, marian, markuplm, mask2former, maskformer, maskformer-swin, mbart, mctct, mega, megatron-bert, mgp-str, mobilebert, mobilenet_v1, mobilenet_v2, mobilevit, mobilevitv2, mpnet, mt5, mvp, nat, nezha, nllb-moe, nystromformer, oneformer, open-llama, openai-gpt, opt, owlvit, pegasus, pegasus_x, perceiver, pix2struct, plbart, poolformer, prophetnet, qdqbert, rag, realm, reformer, regnet, rembert, resnet, retribert, roberta, roberta-prelayernorm, roc_bert, roformer, rwkv, sam, segformer, sew, sew-d, speech-encoder-decoder, speech_to_text, speech_to_text_2, speecht5, splinter, squeezebert, swiftformer, swin, swin2sr, swinv2, switch_transformers, t5, table-transformer, tapas, time_series_transformer, timesformer, timm_backbone, trajectory_transformer, transfo-xl, trocr, tvlt, unispeech, unispeech-sat, upernet, van, videomae, vilt, vision-encoder-decoder, vision-text-dual-encoder, visual_bert, vit, vit_hybrid, vit_mae, vit_msn, wav2vec2, wav2vec2-conformer, wavlm, whisper, xclip, xglm, xlm, xlm-prophetnet, xlm-roberta, xlm-roberta-xl, xlnet, xmod, yolos, yoso
And this is the code I used, as described in the CTranslate2 documentation:
import ctranslate2
import transformers
generator = ctranslate2.Generator("<model_path>/falcon-40b-instruct", device="cuda")
tokenizer = transformers.AutoTokenizer.from_pretrained("<model_path>/falcon-40b-instruct")
prompt = (
"Girafatron is obsessed with giraffes, the most glorious animal on the face of this Earth. "
"Giraftron believes all other animals are irrelevant when compared to the glorious majesty."
"of the giraffe.\nDaniel: Hello, Girafatron!\nGirafatron:"
)
tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(prompt))
results = generator.generate_batch([tokens], sampling_topk=10, max_length=500, include_prompt_in_result=False)
output = tokenizer.decode(results[0].sequences_ids[0])
print(output)
Hi @srimouli04, I am not sure you are doing it properly. First, this line:
generator = ctranslate2.Generator("<model_path>/falcon-40b-instruct", device="cuda")
should be like this:
generator = ctranslate2.Generator("./falcon-40b-instruct", device="cuda")
The generator should be filled with the output_dir of the ct2-transformers-converter command, not the default Hugging Face repo/download.
For the tokenizer, I think you can use:
tokenizer = transformers.AutoTokenizer.from_pretrained("tiiuae/falcon-40b-instruct")
As the tokenizer is the same as the instruct one, you can even use "tiiuae/falcon-40b".
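Putting both together, a minimal sketch of the intended setup (paths assumed to match the conversion command above):
import ctranslate2
import transformers

# The generator points at the CTranslate2 output directory (the converter's --output_dir).
generator = ctranslate2.Generator("./falcon-40b-instruct", device="cuda")
# The tokenizer is loaded from the original Hugging Face repo, not the converted directory.
tokenizer = transformers.AutoTokenizer.from_pretrained("tiiuae/falcon-40b-instruct")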
Hi @jgcb00
Apologies for the confusion. I'm actually using the converted model path, as you mentioned in your comment. But I still get the ValueError if model_type isn't present in the config.json file. I have also tried downloading the tokenizer from HF and moving it into the directory that contains the converted model.
Can you delete the downloaded model and download it again from HF? Also make sure not to set --output_dir to the same directory as the original model.
Hi @guillaumekln
I have followed the steps you mentioned, but I still face the same error.
Can you post the content of the config.json file from the original model directory (i.e. <model_path>/falcon-40b-instruct in your conversion command)?
This is the config file from the original model directory. This is the repo I have been using: https://huggingface.co/tiiuae/falcon-40b-instruct/tree/main
{
  "alibi": false,
  "apply_residual_connection_post_layernorm": false,
  "architectures": [
    "RWForCausalLM"
  ],
  "attention_dropout": 0.0,
  "auto_map": {
    "AutoConfig": "configuration_RW.RWConfig",
    "AutoModelForCausalLM": "modelling_RW.RWForCausalLM"
  },
  "bias": false,
  "bos_token_id": 11,
  "eos_token_id": 11,
  "hidden_dropout": 0.0,
  "hidden_size": 8192,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "model_type": "RefinedWeb",
  "n_head": 128,
  "n_head_kv": 8,
  "n_layer": 60,
  "parallel_attn": true,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.26.0",
  "use_cache": true,
  "vocab_size": 65024
}
Ok, I think I understand now what you are doing. You are loading the tokenizer from the converted model directory, but you should load it from the original model. Something like this:
generator = ctranslate2.Generator("/path/to/converted/model", device="cuda")
tokenizer = transformers.AutoTokenizer.from_pretrained("/path/to/original/model")
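The rest of the example from the documentation should then work unchanged with these two objects, e.g.:
tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(prompt))
results = generator.generate_batch([tokens], sampling_topk=10, max_length=500, include_prompt_in_result=False)
output = tokenizer.decode(results[0].sequences_ids[0])
print(output)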
I have downloaded the model from HF and converted Falcon-40b-instruct using CTranslate2. But when I try to run the model, I get two errors.
Any pointers on how this can be fixed?