huggingface / lighteval

LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processing library datatrove and LLM training library nanotron.
MIT License

Fix _init_max_length in base_model.py #185

Open gucci-j opened 1 month ago

gucci-j commented 1 month ago

What does this PR do?

This PR fixes an error raised in self._init_max_length(config.max_length). I added a try-except block so that the lookup no longer fails when the tokenizer is not yet available; a sketch of the idea is below.
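For reference, a minimal sketch of the guarded lookup, assuming the method keeps its current overall structure (the fallback value and the surrounding logic are assumptions for illustration, not the actual diff):

def _init_max_length(self, max_length) -> int:
    if max_length is not None:
        return int(max_length)
    # self.tokenizer is a property that returns self._tokenizer, which has not
    # been assigned yet at the point where __init__ computes the max length,
    # so even the attribute access itself can raise AttributeError.
    try:
        if hasattr(self.tokenizer, "model_max_length"):
            return self.tokenizer.model_max_length
    except AttributeError:
        pass
    # Conservative default when neither the config nor the tokenizer provides
    # a sequence length (value assumed for illustration).
    return 2048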

Error

An AttributeError occurred while running load_model() for bigscience/bloom-7b1 because self._tokenizer had not been defined at that stage.

WARNING:lighteval.logging.hierarchical_logger:  Test all gather {
WARNING:lighteval.logging.hierarchical_logger:    Test gather tensor
WARNING:lighteval.logging.hierarchical_logger:    gathered_tensor tensor([0], device='cuda:0'), should be [0]
WARNING:lighteval.logging.hierarchical_logger:  } [0:00:00.000649]
WARNING:lighteval.logging.hierarchical_logger:  Creating model configuration {
WARNING:lighteval.logging.hierarchical_logger:  } [0:00:00.000012]
WARNING:lighteval.logging.hierarchical_logger:  Model loading {
loading configuration file config.json from cache at /mnt/parscratch/users/acp23ay/private/hub/models--bigscience--bloom-7b1/snapshots/6232703e399354503377bf59dfbb8397fd569e4a/config.json
Model config BloomConfig {
  "_name_or_path": "bigscience/bloom-7b1",
  "apply_residual_connection_post_layernorm": false,
  "architectures": [
    "BloomForCausalLM"
  ],
  "attention_dropout": 0.0,
  "attention_softmax_in_fp32": true,
  "bias_dropout_fusion": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_dropout": 0.0,
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "layer_norm_epsilon": 1e-05,
  "masked_softmax_fusion": true,
  "model_type": "bloom",
  "n_head": 32,
  "n_inner": null,
  "n_layer": 30,
  "offset_alibi": 100,
  "pad_token_id": 3,
  "pretraining_tp": 1,
  "skip_bias_add": true,
  "skip_bias_add_qkv": false,
  "slow_but_exact": false,
  "torch_dtype": "float16",
  "transformers_version": "4.39.0.dev0",
  "unk_token_id": 0,
  "use_cache": true,
  "vocab_size": 250880
}

WARNING:lighteval.logging.hierarchical_logger:  } [0:00:00.123938]
WARNING:lighteval.logging.hierarchical_logger:} [0:00:00.435923]
Traceback (most recent call last):
  File "/users/acp23ay/src/lighteval/run_evals_accelerate.py", line 82, in <module>
    main(args)
  File "/users/acp23ay/src/lighteval/src/lighteval/logging/hierarchical_logger.py", line 166, in wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/users/acp23ay/src/lighteval/src/lighteval/main_accelerate.py", line 77, in main
    model, model_info = load_model(config=model_config, env_config=env_config)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/users/acp23ay/src/lighteval/src/lighteval/models/model_loader.py", line 83, in load_model
    return load_model_with_accelerate_or_default(config=config, env_config=env_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/users/acp23ay/src/lighteval/src/lighteval/models/model_loader.py", line 125, in load_model_with_accelerate_or_default
    model = BaseModel(config=config, env_config=env_config)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/users/acp23ay/src/lighteval/src/lighteval/models/base_model.py", line 76, in __init__
    self._max_length = self._init_max_length(config.max_length)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/users/acp23ay/src/lighteval/src/lighteval/models/base_model.py", line 269, in _init_max_length
    if hasattr(self.tokenizer, "model_max_length"):
               ^^^^^^^^^^^^^^
  File "/users/acp23ay/src/lighteval/src/lighteval/models/base_model.py", line 103, in tokenizer
    return self._tokenizer
           ^^^^^^^^^^^^^^^
AttributeError: 'BaseModel' object has no attribute '_tokenizer'. Did you mean: 'tokenizer'?
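For context, this is the usual property-before-assignment pattern; a standalone reproduction (class and attribute names made up for illustration, not lighteval's actual code) fails the same way:

class Model:
    def __init__(self):
        # The max length is computed before the tokenizer attribute exists.
        self._max_length = self._init_max_length()
        self._tokenizer = object()  # assigned too late

    @property
    def tokenizer(self):
        # Raises AttributeError if accessed before self._tokenizer is set.
        return self._tokenizer

    def _init_max_length(self):
        # hasattr() cannot guard this: self.tokenizer is evaluated (and raises)
        # before hasattr ever runs.
        if hasattr(self.tokenizer, "model_max_length"):
            return self.tokenizer.model_max_length
        return 2048


Model()  # AttributeError: 'Model' object has no attribute '_tokenizer'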
NathanHB commented 1 month ago

Looks good, thanks for the fix! Could you just address the last comment? I will merge afterward.