facebookresearch / FAMBench

Benchmarks to capture important workloads.
Apache License 2.0

XLM-R failing with ValueError: invalid literal for int() with base 10: '0a0' #90

Open | amathews-amd opened this issue 2 years ago

amathews-amd commented 2 years ago
ValueError: invalid literal for int() with base 10: '0a0'

Possibly related to https://github.com/facebookresearch/fairseq/issues/4532

Log excerpt:

This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/next/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
  **kwargs,
2022-07-07 04:27:14 | INFO | fairseq.tasks.multilingual_masked_lm | dictionary: 250001 types
Traceback (most recent call last):
  File "xlmr/ootb/xlmr.py", line 182, in <module>
    run()
  File "xlmr/ootb/xlmr.py", line 142, in run
    xlmr = get_model()
  File "xlmr/ootb/xlmr.py", line 29, in get_model
    fairseq_xlmr_large = torch.hub.load('pytorch/fairseq:main', 'xlmr.large')
  File "/opt/conda/lib/python3.7/site-packages/torch/hub.py", line 525, in load
    model = _load_local(repo_or_dir, model, *args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/hub.py", line 554, in _load_local
    model = entry(*args, **kwargs)
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/models/roberta/model_xlmr.py", line 44, in from_pretrained
    **kwargs,
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/hub_utils.py", line 75, in from_pretrained
    arg_overrides=kwargs,
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/checkpoint_utils.py", line 473, in load_model_ensemble_and_task
    model = task.build_model(cfg.model, from_checkpoint=True)
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/tasks/fairseq_task.py", line 676, in build_model
    model = models.build_model(args, self, from_checkpoint)
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/models/__init__.py", line 106, in build_model
    return model.build_model(cfg, task)
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/models/roberta/model.py", line 237, in build_model
    encoder = RobertaEncoder(args, task.source_dictionary)
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/models/roberta/model.py", line 553, in __init__
    self.sentence_encoder = self.build_encoder(args, dictionary, embed_tokens)
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/models/roberta/model.py", line 570, in build_encoder
    encoder = TransformerEncoder(args, dictionary, embed_tokens)
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/models/transformer/transformer_encoder.py", line 433, in __init__
    return_fc=return_fc,
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/models/transformer/transformer_encoder.py", line 96, in __init__
    [self.build_encoder_layer(cfg) for i in range(cfg.encoder.layers)]
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/models/transformer/transformer_encoder.py", line 96, in <listcomp>
    [self.build_encoder_layer(cfg) for i in range(cfg.encoder.layers)]
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/models/transformer/transformer_encoder.py", line 438, in build_encoder_layer
    TransformerConfig.from_namespace(args),
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/models/transformer/transformer_encoder.py", line 107, in build_encoder_layer
    cfg, return_fc=self.return_fc
  File "/root/.cache/torch/hub/pytorch_fairseq_main/fairseq/modules/transformer_layer.py", line 131, in __init__
    + int(self.torch_version[2])
ValueError: invalid literal for int() with base 10: '0a0'
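
For context, the failure appears to come from fairseq's transformer_layer.py, which splits torch.__version__ on "." and calls int() on each piece. On nightly/ROCm builds the version string looks like "1.12.0a0+gitabc123", so the third piece starts with "0a0" and int() raises. A minimal sketch of the failure and of a tolerant parse (the version_tuple helper below is hypothetical, not fairseq's code):

import re

# Simplified form of what fairseq main did at the time:
#   torch_version = torch.__version__.split(".")
#   int(torch_version[2])   -> ValueError on '0a0+gitabc123'

# Tolerant alternative: keep only the leading digits of each component.
def version_tuple(version: str):
    nums = []
    for piece in version.split(".")[:3]:
        m = re.match(r"\d+", piece)          # leading digits, if any
        nums.append(int(m.group()) if m else 0)
    return tuple(nums)

print(version_tuple("1.12.0a0+gitabc123"))   # (1, 12, 0)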
amathews-amd commented 2 years ago

This seems to be a workaround for now:

diff --git a/benchmarks/xlmr/ootb/xlmr.py b/benchmarks/xlmr/ootb/xlmr.py
index b0cb790..671e998 100644
--- a/benchmarks/xlmr/ootb/xlmr.py
+++ b/benchmarks/xlmr/ootb/xlmr.py
@@ -26,7 +26,7 @@ def time_ms(use_gpu):
     return time.time_ns() * 1e-6

 def get_model():
-    fairseq_xlmr_large = torch.hub.load('pytorch/fairseq:main', 'xlmr.large')
+    fairseq_xlmr_large = torch.hub.load('pytorch/fairseq:v0.12.0', 'xlmr.large')

     # TODO use torchscript? jit/script this model?
     return fairseq_xlmr_large
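
If pinning the hub tag is not an option, another possible workaround is to compare versions with packaging.version, which understands pre-release suffixes like 0a0. This is an illustrative sketch, not a change fairseq has adopted:

from packaging import version
import torch

# packaging parses PEP 440 strings such as '1.12.0a0+gitabc123' cleanly,
# assuming torch.__version__ is PEP 440-compliant (it normally is).
v = version.parse(torch.__version__)
print(v.release)                         # e.g. (1, 12, 0)
print(v >= version.parse("1.10"))        # version comparisons also work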
erichan1 commented 2 years ago

Yep, that should fix the versioning issue. Glad you found a fix @amathews-amd. I looked at the hub registration here: https://github.com/facebookresearch/fairseq/blob/main/fairseq/models/roberta/model_xlmr.py and it doesn't seem to have changed. Not immediately sure what the underlying problem is.