unslothai / unsloth

Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

Support for models trained with OLMo? #774

Open · CloudyDory opened this issue 1 month ago

CloudyDory commented 1 month ago

Hi, it seems that unsloth currently does not support loading base models trained with OLMo. Is it possible to write a custom script to load the model into unsloth? The model architecture is shown below; it also uses the "pre-layernorm" transformer architecture.

```json
{
  "architectures": [
    "OlmoForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "clip_qkv": null,
  "eos_token_id": 0,
  "hidden_act": "silu",
  "hidden_size": 2048,
  "initializer_range": 0.02,
  "intermediate_size": 8192,
  "max_position_embeddings": 2048,
  "model_type": "olmo",
  "num_attention_heads": 16,
  "num_hidden_layers": 16,
  "num_key_value_heads": 16,
  "pad_token_id": 1,
  "rope_scaling": null,
  "rope_theta": 10000.0,
  "tie_word_embeddings": true,
  "torch_dtype": "float32",
  "transformers_version": "4.42.3",
  "use_cache": true,
  "vocab_size": 51200
}
```
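
For reference, the checkpoint itself appears loadable with plain transformers (`OlmoForCausalLM` shipped upstream before the 4.42.3 version pinned in the config). A minimal sketch, with `"path/to/olmo-checkpoint"` as a placeholder for the directory holding the config above:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# "path/to/olmo-checkpoint" is a placeholder for the model directory.
model = AutoModelForCausalLM.from_pretrained(
    "path/to/olmo-checkpoint",
    torch_dtype=torch.bfloat16,  # the config ships float32; downcast to save memory
)
tokenizer = AutoTokenizer.from_pretrained("path/to/olmo-checkpoint")
```
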
danielhanchen commented 1 month ago

I'm unsure what OLMo's architecture is - in theory Unsloth can work with it, but it's best to wait for full model support in Unsloth.
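
Until then, one possible interim workaround (an assumption, not something Unsloth provides for OLMo today) is to finetune with plain Hugging Face PEFT/LoRA. The target module names below follow the Llama-style projection naming used by the transformers `OlmoForCausalLM` implementation; the checkpoint path is again a placeholder:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# "path/to/olmo-checkpoint" is a placeholder for the model directory.
model = AutoModelForCausalLM.from_pretrained("path/to/olmo-checkpoint")

lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # sanity check: only LoRA params are trainable
```

This skips Unsloth's fused kernels, so it will be slower and use more memory, but it keeps the training loop compatible with standard transformers tooling until OLMo support lands.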