HTTPStatusError: Client error '404 Not Found' when loading benchmark

emotionor commented 3 days ago

Description:

I encountered an issue when trying to load a benchmark using the polaris library. The following code snippet produces an HTTPStatusError:

import polaris as po

benchmark = po.load_benchmark("polaris/hello_world_benchmark")

Error Message:

HTTPStatusError: Client error '404 Not Found' for url 'https://polarishub.io/api/v1/benchmark/polaris/hello_world_benchmark'
Traceback (most recent call last):
  File ~/miniconda3/lib/python3.11/site-packages/polaris/hub/client.py, line 136, in PolarisHubClient._base_request_to_hub
    response.raise_for_status()
  ...
  File ~/miniconda3/lib/python3.11/site-packages/httpx/_models.py, line 761, in response.raise_for_status
    raise HTTPStatusError(message, request=request, response=self)

Environment:

Python version: 3.11
Polaris version: 0.5.0
Operating System: Linux

Additional Information:

It seems like the URL https://polarishub.io/api/v1/benchmark/polaris/hello_world_benchmark might be incorrect or the benchmark is not available on the server. I also tried other datasets, such as polaris/adme-fang-1, and they all return the same 404 error.

cwognum commented 2 days ago

Hi @emotionor, thanks for reporting!

I can reproduce the issue and am looking into it.

cwognum commented 2 days ago

Actually @emotionor, it seems the code example you provided me with has a subtle typo!

Instead of hello_world_benchmark, it should be hello-world-benchmark.

This works for me:

import polaris as po

benchmark = po.load_benchmark("polaris/hello-world-benchmark")

Some sort of fuzzy search (e.g. "hello_world_benchmark does not exist, did you mean hello-world-benchmark?") would perhaps be nice?
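A rough sketch of what such a suggestion could look like, using Python's standard difflib module. This is not part of polaris: `suggest_name`, `not_found_message`, and the hard-coded `known_names` list are hypothetical, and a real version would need to get the list of existing names from the Hub.

```python
import difflib


def suggest_name(requested, known_names, cutoff=0.8):
    """Return the known name closest to `requested`, or None if nothing is close enough."""
    matches = difflib.get_close_matches(requested, known_names, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def not_found_message(requested, known_names):
    """Build a 'did you mean' style message for a name that could not be found."""
    suggestion = suggest_name(requested, known_names)
    if suggestion is None:
        return f"{requested} does not exist."
    return f"{requested} does not exist, did you mean {suggestion}?"


# Hypothetical list of names; hard-coded here only for illustration.
known_names = ["polaris/hello-world-benchmark", "polaris/adme-fang-1"]
print(not_found_message("polaris/hello_world_benchmark", known_names))
```

The cutoff of 0.8 is an arbitrary choice that tolerates a couple of swapped characters, such as underscores for hyphens, without suggesting unrelated names.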

For the Fang dataset, this works for me (notice we're using load_dataset, not load_benchmark).

dataset = po.load_dataset("polaris/adme-fang-1")
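For anyone hitting the same 404, a small wrapper could make the failure easier to read. This is only a sketch: `load_or_explain` is a hypothetical helper, not part of polaris, and it assumes the raised exception carries a `response.status_code` attribute, as httpx's HTTPStatusError does.

```python
def load_or_explain(loader, slug):
    """Call loader(slug) and turn a 404 error into a more descriptive LookupError.

    `loader` is any callable taking the slug. Errors that do not carry a
    response with status code 404 are re-raised unchanged.
    """
    try:
        return loader(slug)
    except Exception as exc:
        response = getattr(exc, "response", None)
        if getattr(response, "status_code", None) != 404:
            raise
        raise LookupError(
            f"{slug!r} was not found. Check the spelling (hyphens vs. underscores) "
            "and whether it is a dataset or a benchmark."
        ) from exc
```

It could be called as load_or_explain(po.load_dataset, "polaris/adme-fang-1") or load_or_explain(po.load_benchmark, "polaris/hello-world-benchmark").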

Let me know if that solves it.

cwognum commented 2 days ago

It seems we use hello_world_benchmark in the documentation. That's something we should fix! Thanks for reporting!

cwognum commented 2 days ago

We updated the docs in #117 and have a related PR to update the code examples on the Hub to use existing benchmarks. Let me know if that helps, @emotionor!

robotcator commented 2 days ago

Hi, @cwognum can you explain the split method when using get_train_test_split after load_benchmark?

emotionor commented 2 days ago

@cwognum Thank you very much for your assistance in resolving the issue. The solution you provided worked perfectly. I appreciate your help and the great work you do in maintaining this project.

cwognum commented 2 days ago

> Hi, @cwognum can you explain the split method when using get_train_test_split after load_benchmark?

Hi @robotcator, happy to answer your question, but this is not the best place to ask. Would you mind posing the question in our Discord or in a GitHub Discussion? That way, it's easier for others with similar questions to learn from your question as well! Thanks!

polaris-hub / polaris

HTTPStatusError: Client error '404 Not Found' when loading benchmark #115