Open learn01one opened 9 months ago
Seems like the issue is the hydra version. I'm not sure which version it was, but try using an older one.
Hello, my environment is torch==2.2.0 and python==3.11; everything else is installed at the provided versions. Many similar errors come up, such as:
ValueError: mutable default <class 'trainer.configs.configs.DebugConfig'> for field debug is not allowed: use default_factory
and
ValueError: mutable default <class 'trainer.accelerators.deepspeed_accelerator.DeepSpeedConfig'> for field deepspeed is not allowed: use default_factory
It looks like an environment problem. Could you provide a more detailed description of your setup? Thanks.
I was running with python 3.8
I don't have access to the env I ran with at the moment. If you continue to have this issue I'll help you debug it.
Hello, all the earlier problems were solved by switching to python==3.8. Training is almost working; I just hit one small problem:
│ 22 │ fp16: MixedPrecisionConfig = MixedPrecisionConfig(enabled=True)
│ 23 │ bf16: MixedPrecisionConfig = MixedPrecisionConfig(enabled=False)
│ 24 │ optimizer: dict = field(default_factory=lambda: {
│ 25 │ │ "type": "AdamW",
TypeError: MixedPrecisionConfig() takes no arguments
Are there any additional parameters required to run it? Thanks.
Can you please paste the entire error trace here? Also, what command are you running?
OK, I'm running it with: accelerate launch --dynamo_backend no --gpu_ids all --num_processes 8 --num_machines 1 --use_deepspeed trainer/scripts/train.py +experiment=clip_h output_dir=output
The entire error trace looks like:
│ /PickScore/trainer/scripts/train.py:13 in
Really strange... I am able to run this code with no problem:
>>> from omegaconf import OmegaConf, MISSING, II
>>>
>>> from dataclasses import dataclass, field
>>>
>>> @dataclass
... class MixedPrecisionConfig:
... enabled: bool = MISSING
...
>>> @dataclass
... class DeepSpeedConfig:
... fp16: MixedPrecisionConfig = MixedPrecisionConfig(enabled=False)
... bf16: MixedPrecisionConfig = MixedPrecisionConfig(enabled=False)
...
>>>
With Python 3.10.13. Perhaps try to use 3.10?
Hello, using python==3.10 the same problem still occurs. Can you show your deepspeed default_config.yaml? Thanks.
What does the deepspeed config have to do with it? Can you locally run the code that fails?
from omegaconf import OmegaConf, MISSING, II
from dataclasses import dataclass, field
@dataclass
class MixedPrecisionConfig:
enabled: bool = MISSING
@dataclass
class DeepSpeedConfig:
fp16: MixedPrecisionConfig = MixedPrecisionConfig(enabled=False)
bf16: MixedPrecisionConfig = MixedPrecisionConfig(enabled=False)
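Side note: on Python 3.11 this exact snippet is expected to raise the "mutable default ... use default_factory" ValueError, because `dataclasses` there rejects any class-level default whose type is unhashable, and a `@dataclass` with the default `eq=True` sets `__hash__ = None`. The fix the error message itself suggests, `field(default_factory=...)`, works on 3.8 through 3.11+. A minimal sketch (using a plain bool default instead of OmegaConf's MISSING so it runs standalone):

```python
from dataclasses import dataclass, field


@dataclass
class MixedPrecisionConfig:
    enabled: bool = False


@dataclass
class DeepSpeedConfig:
    # default_factory builds a fresh nested config per DeepSpeedConfig(),
    # so there is no class-level instance default for 3.11 to reject
    fp16: MixedPrecisionConfig = field(
        default_factory=lambda: MixedPrecisionConfig(enabled=True)
    )
    bf16: MixedPrecisionConfig = field(
        default_factory=lambda: MixedPrecisionConfig(enabled=False)
    )


cfg = DeepSpeedConfig()
print(cfg.fp16.enabled, cfg.bf16.enabled)  # prints: True False
```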
Hello, can I ask which Python version you used? I hit the same problem: ValueError: mutable default <class 'trainer.accelerators.base_accelerator.DebugConfig'> for field debug is not allowed: use default_factory