stanford-crfm / helm

Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in Holistic Evaluation of Text-to-Image Models (HEIM) (https://arxiv.org/abs/2311.04287).
https://crfm.stanford.edu/helm
Apache License 2.0

Support Python 3.11 #1913

Open yifanmai opened 10 months ago

yifanmai commented 10 months ago

HELM currently doesn't work on Python 3.11 because it depends on pyext, which fails to build there: pyext references inspect.getargspec, which was removed from the inspect module in Python 3.11.

Collecting pyext~=0.7 (from crfm-helm==0.2.4)
  Using cached pyext-0.7.tar.gz (7.8 kB)
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'error'
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [23 lines of output]
      Traceback (most recent call last):
        File "/opt/hostedtoolcache/Python/3.11.5/x64/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/opt/hostedtoolcache/Python/3.11.5/x64/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/opt/hostedtoolcache/Python/3.11.5/x64/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
                 ^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-zaasvhf8/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 355, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=['wheel'])
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-build-env-zaasvhf8/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 325, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-zaasvhf8/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 507, in run_setup
          super(_BuildMetaLegacyBackend, self).run_setup(setup_script=setup_script)
        File "/tmp/pip-build-env-zaasvhf8/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 341, in run_setup
          exec(code, locals())
        File "<string>", line 6, in <module>
        File "/tmp/pip-install-pbvzkklw/pyext_c314d5f7141841b1a7d9fe52ee1bf8c9/pyext.py", line 117, in <module>
          oargspec = inspect.getargspec
                     ^^^^^^^^^^^^^^^^^^
      AttributeError: module 'inspect' has no attribute 'getargspec'. Did you mean: 'getargs'?
      [end of output]
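
For illustration only, here is a minimal sketch of how code that relied on inspect.getargspec can be made to work on Python 3.11 by switching to inspect.getfullargspec, which has been available throughout Python 3. The helper name getargspec_compat is hypothetical, and this is not necessarily how pyext or any fork of it addresses the problem.

```python
import inspect


def getargspec_compat(func):
    """Return (args, varargs, varkw, defaults), the four fields that the
    removed inspect.getargspec used to provide.

    Hypothetical helper for illustration; not taken from pyext.
    """
    full = inspect.getfullargspec(func)
    # getfullargspec also reports keyword-only arguments and annotations;
    # only the four legacy fields are kept here.
    return (full.args, full.varargs, full.varkw, full.defaults)


def example(a, b=2, *rest, **options):
    return a


print(getargspec_compat(example))
# (['a', 'b'], 'rest', 'options', (2,))
```

Callers that unpack the old four-field result keep working, while functions with keyword-only arguments or annotations no longer raise as they did under the legacy call.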

matthewdouglas commented 10 months ago

@yifanmai I was able to work around the issue with PyExt by creating a fork matthewdouglas/PyExt@13447e5.

However, more work is needed: helm-run fails with the traceback below. The underlying problem is fixed in huggingface/datasets#5238, so the datasets dependency would need to be bumped as well (>=2.7.0).


Traceback (most recent call last):
  File "/home/matt/helm/.venv/bin/helm-run", line 5, in <module>
    from helm.benchmark.run import main
  File "/home/matt/helm/.venv/lib/python3.11/site-packages/helm/benchmark/run.py", line 19, in <module>
    from helm.benchmark import vlm_run_specs  # noqa
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/matt/helm/.venv/lib/python3.11/site-packages/helm/benchmark/vlm_run_specs.py", line 6, in <module>
    from .run_specs import run_spec_function, get_exact_match_metric_specs
  File "/home/matt/helm/.venv/lib/python3.11/site-packages/helm/benchmark/run_specs.py", line 32, in <module>
    from .scenarios.lex_glue_scenario import (
  File "/home/matt/helm/.venv/lib/python3.11/site-packages/helm/benchmark/scenarios/lex_glue_scenario.py", line 5, in <module>
    import datasets
  File "/home/matt/helm/.venv/lib/python3.11/site-packages/datasets/__init__.py", line 47, in <module>
    from .builder import ArrowBasedBuilder, BeamBasedBuilder, BuilderConfig, DatasetBuilder, GeneratorBasedBuilder
  File "/home/matt/helm/.venv/lib/python3.11/site-packages/datasets/builder.py", line 91, in <module>
    @dataclass
     ^^^^^^^^^
  File "/usr/lib/python3.11/dataclasses.py", line 1230, in dataclass
    return wrap(cls)
           ^^^^^^^^^
  File "/usr/lib/python3.11/dataclasses.py", line 1220, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/dataclasses.py", line 958, in _process_class
    cls_fields.append(_get_field(cls, name, type, kw_only))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/dataclasses.py", line 815, in _get_field
    raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'datasets.utils.version.Version'> for field version is not allowed: use default_factory
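
As a rough sketch of why this error appears only on newer interpreters: starting with Python 3.11, dataclasses rejects any unhashable default value, not just list, dict, and set. A dataclass instance with the default eq=True and no frozen=True is unhashable, so using one as a field default now fails. The Version and Config classes below are hypothetical stand-ins, not the actual datasets code, and the default_factory change simply follows the hint in the error message rather than reproducing the upstream fix.

```python
import sys
from dataclasses import dataclass, field


@dataclass
class Version:
    """Hypothetical stand-in for a version dataclass (eq=True, so unhashable)."""
    text: str = "0.0.0"


def define_with_plain_default():
    # Defining this class raises ValueError on Python 3.11+ because the
    # default is an unhashable instance; older versions accept it.
    @dataclass
    class Config:
        version: Version = Version("0.0.0")

    return Config


try:
    define_with_plain_default()
    plain_default_rejected = False
except ValueError:
    plain_default_rejected = True


@dataclass
class FixedConfig:
    # default_factory builds a fresh Version per instance and is accepted
    # on every Python version that has dataclasses.
    version: Version = field(default_factory=lambda: Version("0.0.0"))


print(sys.version_info[:2], "plain default rejected:", plain_default_rejected)
print(FixedConfig())
```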

yifanmai commented 1 month ago

scikit-image also needs to be upgraded to support Python 3.11. See #2858 for details.