`benchmark-present` prints cuda related warning

stanford-crfm / helm

Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in Holistic Evaluation of Text-to-Image Models (HEIM) (https://arxiv.org/abs/2311.04287).

https://crfm.stanford.edu/helm

Apache License 2.0


`benchmark-present` prints cuda related warning #984

Closed percyliang closed 1 year ago

percyliang commented 1 year ago

Running `benchmark-present` takes a long time to load and prints out this warning:

/home/pliang/benchmarking/venv/lib/python3.8/site-packages/torch/cuda/__init__.py:83: UserWarning: HIP initialization: Unexpected error from hipGetDeviceCount(). Did you run some cuda functions before calling NumHipDevices() that might have already set an error? Error 101: hipErrorInvalidDevice (Triggered internally at  ../c10/hip/HIPFunctions.cpp:110.)
  return torch._C._cuda_getDeviceCount() > 0

Everything after it works fine, but it would be nice not to have this warning.
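
For reference, a warning like the one above can usually be silenced with Python's standard `warnings` filters. The sketch below is only one possible workaround, not something done in this thread. It assumes the message reaches Python as a `UserWarning` whose text starts with "HIP initialization", as the output above suggests. The helper names are hypothetical, and `emit_sample_warning` is a stand-in for the real torch call so that the example runs without PyTorch installed.

```python
import warnings


def silence_hip_init_warning() -> None:
    """Ignore UserWarnings whose message starts with 'HIP initialization'.

    The `message` argument is a regex matched against the start of the
    warning text, so other UserWarnings are left untouched.
    """
    warnings.filterwarnings(
        "ignore",
        message=r"HIP initialization",
        category=UserWarning,
    )


def emit_sample_warning() -> None:
    """Hypothetical stand-in that raises a warning shaped like the one reported."""
    warnings.warn(
        "HIP initialization: Unexpected error from hipGetDeviceCount().",
        UserWarning,
    )


if __name__ == "__main__":
    # Install the filter before the code that triggers the warning runs.
    silence_hip_init_warning()
    emit_sample_warning()  # filtered out, nothing is printed
```

A filter like this only hides the message. It would not address the slow startup mentioned above or whatever triggers the HIP device query in the first place.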

yifanmai commented 1 year ago

Do you have the CLI command that produced this warning?

percyliang commented 1 year ago

benchmark-present (no arguments)

yifanmai commented 1 year ago

I wasn't able to reproduce this on main (5fa4ffe) on a GPU-less machine. Maybe it is environment dependent? I ran:

source venv/bin/activate
pip uninstall crfm-benchmarking
pip install .
benchmark-run

yifanmai commented 1 year ago

It's worth revisiting this after #1000; it's possible that some other dependency is causing the pytorch error.
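
One rough way to check whether a given dependency is what loads PyTorch is to import it in a fresh interpreter and see whether `torch` ends up in `sys.modules`. This is a minimal sketch using only the standard library. The function name `import_pulls_in` is hypothetical, and the thread does not say which dependency is responsible, so the module to test is left to the caller.

```python
import subprocess
import sys


def import_pulls_in(module: str, target: str = "torch") -> bool:
    """Return True if importing `module` in a fresh interpreter also loads `target`.

    A subprocess is used so that modules already imported in the current
    process do not affect the result.
    """
    code = (
        "import importlib, sys\n"
        f"importlib.import_module({module!r})\n"
        f"print({target!r} in sys.modules)\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if the import itself fails
    )
    # The imported module may print its own output, so read only the last line.
    return result.stdout.strip().splitlines()[-1] == "True"


if __name__ == "__main__":
    # Standard-library example; replace "json" with the package under suspicion.
    print(import_pulls_in("json"))
```

Running this against each installed dependency in turn would narrow down which import triggers the torch initialization, assuming the warning is raised at import time rather than later.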

percyliang commented 1 year ago

I'm not getting this error after #1000.