huggingface / nanotron

Minimalistic large language model 3D-parallelism training
Apache License 2.0

[Bug] `TypeError: Config.__init__() [...]` from `examples/config_tiny_llama.py` #35

Closed (saforem2 closed this issue 8 months ago)

saforem2 commented 8 months ago

Fixed by #33

For reference:

I was getting `TypeError: Config.__init__() missing 1 required positional argument: 'profiler'` when trying to run examples/config_tiny_llama.py.

Explicitly:

$ python3 examples/config_tiny_llama.py
Model has 16p4K parameters
Traceback (most recent call last):
  File "/lus/grand/projects/datascience/foremans/locations/polaris/projects/saforem2/nanotron/examples/config_tiny_llama.py", line 90, in <module>
    config = Config(
TypeError: Config.__init__() missing 1 required positional argument: 'profiler'
[1]    30778 exit 1     python3 examples/config_tiny_llama.py
2.78s user 5.53s system 231% cpu 3.590s total

The issue is coming from this line in the Config dataclass in src/nanotron/config/config.py:

https://github.com/huggingface/nanotron/blob/main/src/nanotron/config/config.py#L334

Setting `profiler: Optional[ProfilerArgs] = None` by default (as shown below) fixes this:

@dataclass
class Config:
    # [...]
    profiler: Optional[ProfilerArgs] = None
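
A minimal, self-contained sketch of why the default fixes the error. `ProfilerArgs` here is a hypothetical stand-in for nanotron's class, and the Config is trimmed to an illustrative field, just to make the example runnable:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProfilerArgs:  # hypothetical stand-in for nanotron's ProfilerArgs
    profiler_export_path: str = "profile"


@dataclass
class Config:  # trimmed-down sketch of nanotron's Config
    checkpoints_path: str = "checkpoints"  # hypothetical field for illustration
    profiler: Optional[ProfilerArgs] = None  # the proposed default

# Without the default, omitting `profiler` here would raise:
# TypeError: Config.__init__() missing 1 required positional argument: 'profiler'
config = Config(checkpoints_path="/tmp/ckpts")
print(config.profiler)  # → None
```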
thomwolf commented 8 months ago

Thanks for the issue. Setting `profiler: Optional[ProfilerArgs] = None` by default prevents subclassing Config in other training scripts though, which is useful for adding features without making nanotron too big.

So the best fix will probably be to add `profiler=None` in examples/config_tiny_llama.py, if you can open a PR for this. Cc @NouamaneTazi
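
For context, the subclassing constraint comes from how Python dataclasses order `__init__` arguments: inherited fields come first, and a non-default argument cannot follow a default one. A minimal sketch (with hypothetical `ProfilerArgs` and `ExtendedConfig` names):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProfilerArgs:  # hypothetical stand-in for nanotron's ProfilerArgs
    profiler_export_path: str = "profile"


@dataclass
class Config:  # trimmed sketch: only the defaulted field matters here
    profiler: Optional[ProfilerArgs] = None

# A downstream training script that subclasses Config to add a required
# (no-default) field now fails at class-creation time, because the inherited
# defaulted `profiler` would precede the new non-default field in __init__.
try:
    @dataclass
    class ExtendedConfig(Config):  # hypothetical downstream config
        my_new_feature: str

except TypeError as err:
    print(f"TypeError: {err}")
```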

saforem2 commented 8 months ago

Ahh okay, yeah that makes sense. That was actually what I did here initially before reverting it.

@thomwolf, @NouamaneTazi this should be resolved following these two commits:

NouamaneTazi commented 8 months ago

Thanks for the issue & fix 🤗