Closed seanxuu closed 8 months ago
```
[INFO|configuration_utils.py:802] 2024-01-29 15:32:49,297 >> Model config XverseConfig {
  "_name_or_path": "models/XVERSE-13B-256K",
  "architectures": [
    "XverseForCausalLM"
  ],
  "auto_map": {
    "AutoConfig": "configuration_xverse.XverseConfig",
    "AutoModelForCausalLM": "modeling_xverse.XverseForCausalLM"
  },
  "bos_token_id": 2,
  "eos_token_id": 3,
  "hidden_act": "silu",
  "hidden_size": 5120,
  "initializer_range": 0.02,
  "intermediate_size": 13824,
  "max_position_embeddings": 32768,
  "max_tokenizer_truncation": 262144,
  "model_type": "xverse",
  "num_attention_heads": 40,
  "num_hidden_layers": 40,
  "pad_token_id": 1,
  "rms_norm_eps": 1e-06,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.36.2",
  "use_cache": false,
  "vocab_size": 100534
}

[INFO|modeling_utils.py:3341] 2024-01-29 15:32:49,425 >> loading weights file models/XVERSE-13B-256K/pytorch_model.bin.index.json
[INFO|modeling_utils.py:1341] 2024-01-29 15:32:49,426 >> Instantiating XverseForCausalLM model under default dtype torch.bfloat16.
[INFO|configuration_utils.py:826] 2024-01-29 15:32:49,427 >> Generate config GenerationConfig {
  "bos_token_id": 2,
  "eos_token_id": 3,
  "pad_token_id": 1,
  "use_cache": false
}
```
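For reference, the snippet below just re-parses the config JSON from the log above (values copied verbatim) to highlight that this XVERSE-13B-256K checkpoint reports `max_position_embeddings` of 32768 while `max_tokenizer_truncation` is 262144. This is only a minimal illustration of what the log shows; it makes no claim about what the linked fix changes.

```python
import json

# Subset of the model config printed in the log above (XVERSE-13B-256K).
config_json = """
{
  "max_position_embeddings": 32768,
  "max_tokenizer_truncation": 262144,
  "model_type": "xverse",
  "torch_dtype": "bfloat16",
  "vocab_size": 100534
}
"""

config = json.loads(config_json)

# Note the gap between the position-embedding limit and the
# tokenizer truncation length reported by this config.
print(config["max_position_embeddings"])   # 32768
print(config["max_tokenizer_truncation"])  # 262144
```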
I found a way to solve it: https://github.com/xverse-ai/XVERSE-13B/issues/27#issuecomment-1907907
Reminder
Reproduction
Expected behavior
Bug report
System Info
transformers version: 4.36.2

Others
No response