I have the same issue.
The listed models are "proof-of-concept" at most; they are not meant for real use. For coherent output, someone would need to train a new model from scratch with BitNet in mind.
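For context on what "with BitNet in mind" means: during training, every linear layer's weights are quantized to ternary values on the forward pass, so the model learns under that constraint from step one. Here is a minimal sketch of the absmean quantization scheme described in the BitNet b1.58 paper, assuming PyTorch (the function name and `eps` value are mine, not from the repo):

```python
import torch

def absmean_quantize(w: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Quantize a weight tensor to {-1, 0, +1} times a per-tensor scale,
    following the absmean scheme from the BitNet b1.58 paper."""
    scale = w.abs().mean() + eps          # gamma: mean absolute value
    return (w / scale).round().clamp(-1, 1) * scale

# During training this replaces each linear layer's weights on the forward
# pass (with a straight-through estimator on the backward pass), which is
# why a checkpoint has to be trained this way from the start rather than
# converted afterwards.
```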
Yeah, I guess 100B tokens isn't enough to get even autocomplete-level output. That sucks. I'll look into training one myself.
As others have said, this isn't a problem with BitNet.
Problem: The model generates repetitive, nonsensical outputs like "Breis" regardless of the input provided. This happens even with different generation settings (e.g., temperature, top_k, top_p).
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model from Hugging Face
model = AutoModelForCausalLM.from_pretrained("1bitLLM/bitnet_b1_58-3B")
```
```python
from transformers import AutoConfig

# Load and inspect the model configuration
config = AutoConfig.from_pretrained("1bitLLM/bitnet_b1_58-3B")
print(config)
```

Output:

```
LlamaConfig {
  "_name_or_path": "1bitLLM/bitnet_b1_58-3B",
  "architectures": ["BitnetForCausalLM"],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "head_dim": 100,
  "hidden_act": "silu",
  "hidden_size": 3200,
  "initializer_range": 0.02,
  "input_bits": 8,
  "intermediate_size": 8640,
  "max_position_embeddings": 2048,
  "mlp_bias": false,
  "model_type": "llama",
  "num_attention_heads": 32,
  "num_hidden_layers": 26,
  "num_key_value_heads": 32,
  "pad_token_id": 32000,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_scaling": null,
  "rope_theta": 10000.0,
  "tie_word_embeddings": true,
  "torch_dtype": "float16",
  "transformers_version": "4.45.1",
  "use_cache": true,
  "vocab_size": 32002,
  "weight_bits": 1
}
```
```python
from tokenization_bitnet import BitnetTokenizer

# Load the custom tokenizer (standard Llama special tokens)
tokenizer = BitnetTokenizer(
    vocab_file="bitnet_tokenizer/tokenizer.model",
    special_tokens_map_file="bitnet_tokenizer/special_tokens_map.json",
    tokenizer_config_file="bitnet_tokenizer/tokenizer_config.json",
    unk_token="<unk>",
    bos_token="<s>",
    eos_token="</s>",
)

# Example input for inference
input_text = "?"
inputs = tokenizer(input_text, return_tensors="pt")

# Perform inference (do_sample=True is required for temperature,
# top_k, and top_p to have any effect; without it generate() falls
# back to greedy decoding)
outputs = model.generate(
    **inputs,
    max_length=20,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0]))
```
The output:

```
? suppis Breis Breis Breis Breis Breis Breis Breis Breis
```
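One sanity check worth running to rule out the sampling settings (these are standard `transformers` `generate()` keyword arguments; the exact values here are illustrative): switch to deterministic greedy decoding with a repetition penalty. If the output is still degenerate, the checkpoint itself is the cause, which matches the undertrained proof-of-concept explanation above.

```python
# Sanity check: deterministic decoding with a repetition penalty.
# If the output is still "Breis Breis ...", sampling settings are
# not the problem.
outputs = model.generate(
    **inputs,
    max_length=20,
    do_sample=False,           # greedy decoding, no sampling randomness
    repetition_penalty=1.3,    # penalize already-generated tokens
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```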