sudhir-mcw opened this issue 1 month ago
The likely cause is that you have not run install-heim-extras.sh
as explained in the HEIM docs; could you try that and see if that fixes things?
Sorry that this was not clearly explained in the documentation. I've updated the documentation to make things clearer.
Hi @yifanmai, thanks for the reply. I tried again after running install-heim-extras.sh, but the process gets interrupted with the following error:
```
AestheticsMetric() {
  Parallelizing computation on 1 items over 4 threads {
    100%|██████████| 1/1 [00:08<00:00, 8.58s/it]
  } [8.579s]
} [8.58s]
CLIPScoreMetric(multilingual=False) {
  Parallelizing computation on 1 items over 4 threads {
    0%|          | 0/1 [00:00<?, ?it/s]
  } [0.002s]
} [0.002s]
} [14.125s]
} [6m14.466s]
Error when running mscoco:model=huggingface_stable-diffusion-v1-4:
Traceback (most recent call last):
  File "/mnt/gpu-perf-test-storage/sudhir/helm/src/helm/benchmark/runner.py", line 216, in run_all
    self.run_one(run_spec)
  File "/mnt/gpu-perf-test-storage/sudhir/helm/src/helm/benchmark/runner.py", line 307, in run_one
    metric_result: MetricResult = metric.evaluate(
  File "/mnt/gpu-perf-test-storage/sudhir/helm/src/helm/benchmark/metrics/metric.py", line 143, in evaluate
    results: List[List[Stat]] = parallel_map(
  File "/mnt/gpu-perf-test-storage/sudhir/helm/src/helm/common/general.py", line 235, in parallel_map
    results = list(tqdm(executor.map(process, items), total=len(items), disable=None))
  File "/mnt/gpu-perf-test-storage/sudhir/miniconda3/envs/crfm-helm/lib/python3.9/site-packages/tqdm/std.py", line 1181, in __iter__
    for obj in iterable:
  File "/mnt/gpu-perf-test-storage/sudhir/miniconda3/envs/crfm-helm/lib/python3.9/concurrent/futures/_base.py", line 609, in result_iterator
    yield fs.pop().result()
  File "/mnt/gpu-perf-test-storage/sudhir/miniconda3/envs/crfm-helm/lib/python3.9/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/mnt/gpu-perf-test-storage/sudhir/miniconda3/envs/crfm-helm/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/mnt/gpu-perf-test-storage/sudhir/miniconda3/envs/crfm-helm/lib/python3.9/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/mnt/gpu-perf-test-storage/sudhir/helm/src/helm/benchmark/metrics/metric.py", line 77, in process
    self.metric.evaluate_generation(
  File "/mnt/gpu-perf-test-storage/sudhir/helm/src/helm/benchmark/metrics/image_generation/clip_score_metrics.py", line 58, in evaluate_generation
    prompt = WindowServiceFactory.get_window_service(model, metric_service).truncate_from_right(prompt)
  File "/mnt/gpu-perf-test-storage/sudhir/helm/src/helm/benchmark/window_services/window_service_factory.py", line 17, in get_window_service
    model_deployment: Optional[ModelDeployment] = get_model_deployment(model_deployment_name)
  File "/mnt/gpu-perf-test-storage/sudhir/helm/src/helm/benchmark/model_deployment_registry.py", line 130, in get_model_deployment
    raise ValueError(f"Model deployment {name} not found")
ValueError: Model deployment openai/clip-vit-large-patch14 not found
100%|██████████| 1/1 [06:14<00:00, 374.49s/it]
} [6m21.356s]
Traceback (most recent call last):
  File "/mnt/gpu-perf-test-storage/sudhir/miniconda3/envs/crfm-helm/bin/helm-run", line 8, in <module>
```
It runs fine up through the aesthetics metric, but stops at the CLIPScore calculation. Is there any configuration I am missing?
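For reference, the failing lookup can be reproduced outside of helm-run using the same registry function the traceback goes through. This is a minimal sketch, assuming a from-source checkout where the helm package is importable; the deployment name is copied verbatim from the error above:

```python
# Minimal sketch: repeat the lookup that fails inside CLIPScoreMetric.
# get_model_deployment is the function raising the ValueError in the
# traceback above; the name below is taken verbatim from that error.
from helm.benchmark.model_deployment_registry import get_model_deployment

try:
    get_model_deployment("openai/clip-vit-large-patch14")
    print("Deployment resolved")
except ValueError as err:
    # Same failure path as in the run logs.
    print(f"Lookup failed: {err}")
```

If this also fails in a plain interpreter, the deployment is simply not registered in this install, rather than being something specific to the benchmark run.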
I'm able to reproduce this myself. @teetone would you know what's happening here?
Hi @teetone, I am trying out HEIM with the command from the HEIM documentation and am running into the following issue:
```
HuggingFaceDiffusersClient error: Failed to import diffusers.pipelines.stable_diffusion because of the following error (look up to see its traceback):
'Config' object has no attribute 'define_bool_state'
Request failed. Retrying (attempt #2) in 10 seconds... (See above for error details)
  File "helm/src/helm/benchmark/window_services/window_service_factory.py", line 17, in get_window_service
    model_deployment: Optional[ModelDeployment] = get_model_deployment(model_deployment_name)
  File "helm/src/helm/benchmark/model_deployment_registry.py", line 132, in get_model_deployment
    raise ValueError(f"Model deployment {name} not found")
ValueError: Model deployment openai/clip-vit-large-patch14 not found
  0%|          | 0/1 [00:35<?, ?it/s]
} [37.279s]
Traceback (most recent call last):
  File "helm/src/helm/benchmark/run.py", line 380, in <module>
    main()
  File "helm/src/helm/common/hierarchical_logger.py", line 104, in wrapper
    return fn(*args, **kwargs)
  File "helm/src/helm/benchmark/run.py", line 351, in main
    run_benchmarking(
  File "helm/src/helm/benchmark/run.py", line 128, in run_benchmarking
    runner.run_all(run_specs)
  File "helm/src/helm/benchmark/runner.py", line 226, in run_all
    raise RunnerError(f"Failed runs: [{failed_runs_str}]")
helm.benchmark.runner.RunnerError: Failed runs: ["mscoco:model=huggingface_stable-diffusion-v1-4"]
```
Here is information on my setup: a conda env with Python 3.9.20. I installed HEIM by building from source instead of using pip, as the pip install was taking quite a long time to resolve the dependencies. Here are the steps I used to install:

I checked the community forum and tried replacing the jax version with the latest as well, but still no luck.
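In case it is useful for triage, the 'define_bool_state' failure looks like a jax/flax version mismatch: newer jax releases removed Config.define_bool_state, while older flax versions (imported via diffusers' Flax stable-diffusion pipelines) still call it at import time. A minimal sketch for checking the installed combination inside the crfm-helm env (just version introspection, nothing HEIM-specific):

```python
# Minimal sketch: check whether the installed jax still exposes
# Config.define_bool_state, the attribute named in the error above.
# Older flax versions call it at import time; newer jax releases
# removed it, which produces exactly this kind of AttributeError.
import jax
import flax

print("jax :", jax.__version__)
print("flax:", flax.__version__)
print("has define_bool_state:", hasattr(jax.config, "define_bool_state"))
```

If the last line prints False, pinning jax/jaxlib to an older release, or upgrading flax, would be the direction to investigate.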
Is there any other installation and quick start documentation for HEIM apart from heim.md in the docs?