stanford-crfm / helm

Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in Holistic Evaluation of Text-to-Image Models (HEIM) (https://arxiv.org/abs/2311.04287).
https://crfm.stanford.edu/helm
Apache License 2.0

helm-critique breaking change #1887

Closed msaroufim closed 9 months ago

msaroufim commented 9 months ago

Repro

I'm going to pin to an older version to avoid this, but I noticed that some recent changes made here disappear if I go back to an older version (the one with decode implemented for the NeurIPS client).

From here https://github.com/llm-efficiency-challenge/neurips_llm_efficiency_challenge

pip install git+https://github.com/stanford-crfm/helm.git
helm-run --conf-paths run_specs_full_coarse_600_budget.conf --suite v1

Logs

Traceback (most recent call last):
  File "/opt/conda/envs/helm/bin/helm-run", line 5, in <module>
    from helm.benchmark.run import main
  File "/opt/conda/envs/helm/lib/python3.10/site-packages/helm/benchmark/run.py", line 19, in <module>
    from .executor import ExecutionSpec
  File "/opt/conda/envs/helm/lib/python3.10/site-packages/helm/benchmark/executor.py", line 9, in <module>
    from helm.proxy.services.server_service import ServerService
  File "/opt/conda/envs/helm/lib/python3.10/site-packages/helm/proxy/services/server_service.py", line 22, in <module>
    from helm.proxy.clients.auto_client import AutoClient
  File "/opt/conda/envs/helm/lib/python3.10/site-packages/helm/proxy/clients/auto_client.py", line 21, in <module>
    from helm.proxy.critique.critique_client import CritiqueClient
ModuleNotFoundError: No module named 'helm.proxy.critique'

yifanmai commented 9 months ago

Thanks for the report.

@JosselinSomervilleRoberts could you take a look at this?

JosselinSomervilleRoberts commented 9 months ago

Hey @msaroufim, could you please see if #1888 solved it? I was not able to reproduce the error on my side. Thanks for reporting the bug!
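
One way to check whether a given install contains the package named in the traceback is to ask Python's import machinery directly. The sketch below is only a diagnostic and is not part of HELM: `is_importable` is a hypothetical helper name, and it relies on the standard-library `importlib.util.find_spec` to report whether a dotted module path resolves. Parent packages are imported during the lookup, but the target module itself is not executed.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if `module_name` can be located by the import system.

    Hypothetical helper for illustration; not part of HELM.
    """
    try:
        # find_spec returns None when the parent package exists
        # but the final component cannot be found.
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. `helm`) is itself missing.
        return False
    return spec is not None


# The module that the traceback in this issue failed to import.
print(is_importable("helm.proxy.critique"))
```

Running this in the same environment as `helm-run` after reinstalling would show whether the fix took effect: a result of `False` would suggest the installed build still lacks `helm.proxy.critique`.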