RDFLib / prez

Prez is a data-configurable Linked Data API framework that delivers profiles of Knowledge Graph data according to the Content Negotiation by Profile standard.
BSD 3-Clause "New" or "Revised" License

Better error message for missing SPARQL_ENDPOINT variable #260

Closed (ashleysommer closed this 2 weeks ago)

ashleysommer commented 2 weeks ago

If you forget to provide a value for the SPARQL_ENDPOINT variable (either through an env var or in the config file), Prez starts anyway and emits an error like this during startup:

2024-08-27 22:30:03.029 [INFO] prez.services.app_service: Checking SPARQL endpoint None is online
ERROR:azure.functions.AsgiMiddleware:Failed ASGI startup with message 'Traceback (most recent call last):
  File "/home/som05d/CODE/RDF/prez/.venv/lib/python3.11/site-packages/starlette/routing.py", line 732, in lifespan
    async with self.lifespan_context(app) as maybe_state:
  File "/home/som05d/CODE/RDF/prez/.venv/lib/python3.11/site-packages/starlette/routing.py", line 608, in __aenter__
    await self._router.startup()
  File "/home/som05d/CODE/RDF/prez/.venv/lib/python3.11/site-packages/starlette/routing.py", line 709, in startup
    await handler()
  File "/home/som05d/CODE/RDF/prez/prez/app.py", line 104, in app_startup
    await healthcheck_sparql_endpoints()
  File "/home/som05d/CODE/RDF/prez/prez/services/app_service.py", line 36, in healthcheck_sparql_endpoints
    response = httpx.get(
               ^^^^^^^^^^
  File "/home/som05d/CODE/RDF/prez/.venv/lib/python3.11/site-packages/httpx/_api.py", line 198, in get
    return request(
           ^^^^^^^^
  File "/home/som05d/CODE/RDF/prez/.venv/lib/python3.11/site-packages/httpx/_api.py", line 106, in request
    return client.request(
           ^^^^^^^^^^^^^^^
  File "/home/som05d/CODE/RDF/prez/.venv/lib/python3.11/site-packages/httpx/_client.py", line 814, in request
    request = self.build_request(
              ^^^^^^^^^^^^^^^^^^^
  File "/home/som05d/CODE/RDF/prez/.venv/lib/python3.11/site-packages/httpx/_client.py", line 345, in build_request
    url = self._merge_url(url)
          ^^^^^^^^^^^^^^^^^^^^
  File "/home/som05d/CODE/RDF/prez/.venv/lib/python3.11/site-packages/httpx/_client.py", line 375, in _merge_url
    merge_url = URL(url)
                ^^^^^^^^
  File "/home/som05d/CODE/RDF/prez/.venv/lib/python3.11/site-packages/httpx/_urls.py", line 119, in __init__
    raise TypeError(
TypeError: Invalid type for url.  Expected str or httpx.URL, got <class 'NoneType'>: None
'.

Notice that execution gets all the way into the depths of the httpx library before anything notices the URL is None.

It would be easy to add a check on startup (before calling healthcheck_sparql_endpoints()) to ensure SPARQL_ENDPOINT is not None, and to emit a clearer error telling the user how to fix it.
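A minimal sketch of such a pre-flight check, using only the standard library. The names `MissingEndpointError` and `require_sparql_endpoint` are hypothetical and not part of Prez, and the sketch reads the value straight from an environment mapping rather than from Prez's real settings object, which is an assumption made to keep it self-contained:

```python
import os
from typing import Mapping, Optional


class MissingEndpointError(RuntimeError):
    """Hypothetical error type: raised when no SPARQL endpoint is configured."""


def require_sparql_endpoint(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the configured endpoint, or fail early with an actionable message.

    `env` defaults to os.environ; passing a plain dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    endpoint = env.get("SPARQL_ENDPOINT")
    # Treat both "unset" and "blank" as missing, so the failure happens here
    # rather than deep inside the HTTP client.
    if endpoint is None or not endpoint.strip():
        raise MissingEndpointError(
            "SPARQL_ENDPOINT is not set. Provide it as an environment variable "
            "or in the config file before starting the application."
        )
    return endpoint.strip()
```

Called before the health check, a guard like this would turn the long traceback into a single line that names the variable to set.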

edmondchuc commented 2 weeks ago

We should probably remove the Optional typing and the default None value in the pydantic settings [1].

https://github.com/RDFLib/prez/blob/c5d39319418cb7442c97562fc0babe2e4c773c61/prez/config.py#L33

At least that way, it'll fail on startup when the config object is populated and no value is found for the env var.

[1] https://docs.pydantic.dev/latest/concepts/pydantic_settings/

ashleysommer commented 2 weeks ago

@edmondchuc Sounds like a sensible and pydantic way to do it to me.

recalcitrantsupplant commented 2 weeks ago

It's optional because sparql_repo_type can be "pyoxigraph", in which case no SPARQL endpoint is required. We can do a slightly more complex validation on startup, though: if sparql_repo_type is "remote", then a sparql_endpoint must be specified.
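The conditional rule described here could look roughly like the following. This is a sketch only: it uses a plain dataclass instead of Prez's actual pydantic settings class, so the class name `Settings` and its shape are assumptions; only the field names and the two repo type values come from this thread. In pydantic the same check would normally live in a validator on the settings model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Settings:
    """Hypothetical stand-in for the real settings class."""

    sparql_repo_type: str
    sparql_endpoint: Optional[str] = None  # stays optional for "pyoxigraph"

    def __post_init__(self) -> None:
        # Only a remote repo needs an endpoint; other backends may omit it.
        if self.sparql_repo_type == "remote" and not self.sparql_endpoint:
            raise ValueError(
                'sparql_endpoint must be specified when sparql_repo_type is "remote"'
            )
```

With this shape, `Settings(sparql_repo_type="pyoxigraph")` is accepted without an endpoint, while `Settings(sparql_repo_type="remote")` fails as soon as the config object is built.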

edmondchuc commented 2 weeks ago

Ah, that's why it's optional! I forgot about the different backend types.