Closed kazuya-hodatsu-336-1 closed 2 weeks ago
Hello, Thank you for reaching out about the issues you encountered while training your model. We recommend using the official Hugging Face training code, as it has proven to be reliable and effective in our experience. We have successfully used the official Hugging Face training code and have achieved good performance results. It has consistently worked well for us, and we haven't encountered any significant issues... Thanks!
@ridgerchu Thanks for replying. I'll try it.
Could anyone provide the links to the official Hugging Face training code? Or, if possible, could you share what worked, @kazuya-hodatsu-336-1?
@radna0 This 🤗 tutorial is my go-to resource when I want to train a model from scratch. It also has a link to the Colab notebook, which is pretty handy.
Just make sure to swap AutoConfig with HGRNBitConfig, and AutoModel with AutoModelForCausalLM:
from mmfreelm.models import HGRNBitConfig
from transformers import AutoModelForCausalLM

# Config for the 370M model
# Reference: https://huggingface.co/ridger/MMfreeLM-370M/blob/main/config.json
config_params = {
    "attn_mode": "fused_recurrent",
    "bos_token_id": 1,
    "conv_size": 4,
    "eos_token_id": 2,
    "expand_ratio": 1,
    "fuse_cross_entropy": True,
    "hidden_act": "swish",
    "hidden_ratio": 4,
    "hidden_size": 1024,
    "initializer_range": 0.02,
    "intermediate_size": None,
    "max_position_embeddings": 2048,
    "model_type": "hgrn_bit",
    "num_heads": 1,
    "num_hidden_layers": 24,
    "rms_norm_eps": 1e-06,
    "share_conv_kernel": True,
    "tie_word_embeddings": False,
    "torch_dtype": "bfloat16",
    "transformers_version": "4.40.2",
    "use_cache": True,
    "use_lower_bound": True,
    "use_short_conv": False,
    "vocab_size": 32000,
}

# Build a randomly initialized model from the config (no pretrained weights are loaded)
config = HGRNBitConfig(**config_params)
model = AutoModelForCausalLM.from_config(config)
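Once the model is built, the training data still has to be cut into fixed-length sequences. Here is a minimal sketch of the usual "concatenate and chunk" step, assuming your tokenizer output is a dict with an "input_ids" column holding one list of token ids per document. The name group_texts and the default block_size of 2048 (taken from max_position_embeddings above) are my own choices for illustration, and this may not be exactly what the linked tutorial does:

```python
from itertools import chain
from typing import Dict, List


def group_texts(
    examples: Dict[str, List[List[int]]], block_size: int = 2048
) -> Dict[str, List[List[int]]]:
    """Concatenate tokenized documents and cut them into fixed-length blocks.

    Assumes `examples` contains an "input_ids" column (hypothetical layout,
    matching typical tokenizer output in batched mode).
    """
    # Flatten every column (e.g. "input_ids", "attention_mask") into one long list.
    concatenated = {
        key: list(chain.from_iterable(rows)) for key, rows in examples.items()
    }
    total_length = len(concatenated["input_ids"])
    # Drop the trailing remainder so every block has exactly block_size tokens.
    total_length = (total_length // block_size) * block_size
    result = {
        key: [tokens[i : i + block_size] for i in range(0, total_length, block_size)]
        for key, tokens in concatenated.items()
    }
    # Labels are a copy of the inputs; causal LM heads usually apply the
    # one-token shift themselves when computing the loss.
    result["labels"] = [block.copy() for block in result["input_ids"]]
    return result
```

With the datasets library this kind of function is typically applied via something like tokenized_dataset.map(group_texts, batched=True), and the result is passed to Trainer as train_dataset. I'd try a small block_size and a shrunken config first to check that the whole pipeline runs before starting a full run.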
Hello.
I tried to conduct training using the code below, but I kept encountering errors and couldn't get it to work.
Could you provide a sample training method to create a new model?
PyTorch version: 2.3.1+cu121
Triton version: 2.2.0
Einops version: 0.8.0
Error:
Am I doing something wrong to begin with? Sorry if I am making a big mistake.