facebookresearch / FAMBench

Benchmarks to capture important workloads.

XLM-R failing: module 'omegaconf._utils' has no attribute 'is_primitive_type' #86

Closed amathews-amd closed 2 years ago

amathews-amd commented 2 years ago

We are noticing a new failure:

Traceback (most recent call last):
  File "xlmr/ootb/xlmr.py", line 178, in <module>
    run()
  File "xlmr/ootb/xlmr.py", line 142, in run
    xlmr = get_model()
  File "xlmr/ootb/xlmr.py", line 29, in get_model
    fairseq_xlmr_large = torch.hub.load('pytorch/fairseq:main', 'xlmr.large')
  File "/opt/conda/lib/python3.7/site-packages/torch/hub.py", line 399, in load
    model = _load_local(repo_or_dir, model, *args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/hub.py", line 428, in _load_local
    model = entry(*args, **kwargs)
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/models/roberta/model_xlmr.py", line 44, in from_pretrained
    **kwargs,
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/hub_utils.py", line 75, in from_pretrained
    arg_overrides=kwargs,
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/checkpoint_utils.py", line 421, in load_model_ensemble_and_task
    state = load_checkpoint_to_cpu(filename, arg_overrides)
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/checkpoint_utils.py", line 339, in load_checkpoint_to_cpu
    state = _upgrade_state_dict(state)
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/checkpoint_utils.py", line 677, in _upgrade_state_dict
    state["cfg"] = convert_namespace_to_omegaconf(state["args"])
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/dataclass/utils.py", line 405, in convert_namespace_to_omegaconf
    with omegaconf_no_object_check():
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/dataclass/utils.py", line 367, in __init__
    self.old_is_primitive = _utils.is_primitive_type
AttributeError: module 'omegaconf._utils' has no attribute 'is_primitive_type'

xuzhao9 commented 2 years ago

This is because omegaconf was recently upgraded and its API changed; see the upstream fix: https://github.com/facebookresearch/fairseq/pull/4440
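
For context, a minimal sketch of where this breaks and of a version-tolerant lookup in the spirit of that upstream fix (the new name is_primitive_type_annotation is my understanding of the omegaconf 2.2 rename, not something confirmed in this thread):

from omegaconf import _utils

# fairseq's omegaconf_no_object_check saves this private helper so it can
# temporarily monkey-patch it; newer omegaconf removed the old name, hence:
#   self.old_is_primitive = _utils.is_primitive_type   # AttributeError on 2.2+

# Version-tolerant lookup (hedged sketch, not fairseq's exact code):
if hasattr(_utils, "is_primitive_type"):            # omegaconf < 2.2
    old_is_primitive = _utils.is_primitive_type
else:                                               # omegaconf >= 2.2, assumed rename
    old_is_primitive = _utils.is_primitive_type_annotation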

amathews-amd commented 2 years ago

Thanks @xuzhao9 @erichan1. Can you move XLM-R to a submodule that can be imported into FAMBench, so we can pick up upstream changes?

xuzhao9 commented 2 years ago

@amathews-amd Just a reminder that upstream hasn't merged the fix yet, so it may be better to pin omegaconf to an older version.
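
One way to pin it (a sketch; the upper bound assumes the attribute disappeared in omegaconf 2.2):

pip install "omegaconf<2.2"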

erichan1 commented 2 years ago

@amathews-amd Yes, I'd suggest pinning the omegaconf version for now. With my current bandwidth, it will take me a while to shift from pulling the model from torch hub to using fairseq directly.

Edit: I'm actually not sure exactly how torch hub works. Fixes might reach torch hub on a nightly basis? If that's not the case, I'd stick with pinning the omegaconf version.
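
For what it's worth, torch.hub.load('pytorch/fairseq:main', ...) clones the main branch of the GitHub repo into ~/.cache/torch/hub, so an upstream fix is picked up once it is merged and the cached checkout is refreshed; a sketch using torch.hub's force_reload flag:

import torch

# force_reload=True discards the cached fairseq checkout and re-clones
# fairseq:main, so a merged upstream fix takes effect immediately.
xlmr = torch.hub.load('pytorch/fairseq:main', 'xlmr.large', force_reload=True)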

amathews-amd commented 2 years ago

Fixed by pinning hydra-core: pip install hydra-core==1.1.2
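
A quick check that the pin worked (a sketch; it assumes hydra-core 1.1.2 transitively resolves an omegaconf release older than 2.2, which still exposes the attribute fairseq patches):

from omegaconf import _utils

# Should pass on the omegaconf pulled in by hydra-core==1.1.2;
# fails again if omegaconf is upgraded to 2.2+.
assert hasattr(_utils, "is_primitive_type")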