zazuko / trifid

Lightweight Linked Data Server and Proxy
Apache License 2.0

URI rewriting not working with Docker image #193

Closed mikel-egana-aranguren closed 5 months ago

mikel-egana-aranguren commented 10 months ago

Hi;

I'm using Trifid with Blazegraph on a Google Cloud machine.

The Blazegraph endpoint contains the following data:

<http://fair/data/amy-farrah-fowler> <http://schema.org/knows> <http://fair/data/mikel> .

And it works just fine:

[Screenshot: Blazegraph]

I execute Trifid with:

docker run -p 3031:8080 -e "SPARQL_ENDPOINT_URL=http://xxx.xxx.xx.xx:9999/blazegraph/namespace/um/sparql" -e "DATASET_BASE_URL=http://fair/" ghcr.io/zazuko/trifid

And although I can see Trifid working:

[Screenshot: Trifid]

I get the following error and the Docker container simply dies:

[Screenshot: shell output]

It looks like Trifid is trying to fetch a resource from the dataset base URL, but it shouldn't. Perhaps the environment variables are not passed correctly to the Docker container?
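One way to check whether the variables reached the container is to read them back from the container's configuration, which `docker inspect` can do even after the container has exited. The following is a minimal sketch: `filter_trifid_env` and `show_container_env` are hypothetical helper names, and the container name or ID passed in is whatever `docker ps -a` reports on your machine.

```shell
# Hypothetical helpers for illustration; not part of Trifid or Docker.

# Keep only the two variables Trifid is configured with in this thread.
filter_trifid_env() {
  grep -E '^(SPARQL_ENDPOINT_URL|DATASET_BASE_URL)='
}

# Print the environment recorded for a container, one KEY=value per line,
# then filter it. $1 is the container name or ID (see `docker ps -a`).
show_container_env() {
  docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' "$1" \
    | filter_trifid_env
}

# Usage (container name is an assumption):
# show_container_env my_trifid_container
```

If both variables appear with the expected values, the problem is more likely in how the server uses them than in how Docker passes them.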

Any ideas are welcome.

Thanks

Regards

PS: this is material for a Master's-level module on FAIR data publication.

ktk commented 10 months ago

Thanks for the report. As I mentioned on LinkedIn, we are rewriting the inner workings of Trifid right now. The rewrite module used here is one of the reasons: it causes many problems and, in its current form, is almost impossible to fix.

I will discuss the issue with my colleagues, but my first guess is that you will have to wait for the new version to have this properly fixed.

mikel-egana-aranguren commented 10 months ago

Thanks. When do you expect to release the new version? (I'm in a bit of a hurry since the classes are on the 28th of November. If you can pass me a prototype of the new version that works minimally, I can test it as well.)

ktk commented 10 months ago

That won't be ready by then, I'm afraid.

What you could try is to replace Blazegraph with Oxigraph (the binary version, not the WASM one we will integrate); it's super lightweight. We never did much with Blazegraph, so it might be that it does not behave as we expect and kills Trifid at some point.

mikel-egana-aranguren commented 10 months ago

Ok thanks, I will try.

mikel-egana-aranguren commented 7 months ago

Hi again:

I have tried with Oxigraph (0.3.22) and Trifid (4.1.1), but it behaves similarly: instead of querying the SPARQL endpoint, it tries to resolve the mapped URI (http://example.org).

Docker-compose:

version: "3"
services:
   linked_data_server:
      image: ghcr.io/zazuko/trifid:v4.1.1
      ports:
         - "8080:8080"
      environment:
         SPARQL_ENDPOINT_URL: "http://sparql_endpoint:7878/query"
         DATASET_BASE_URL: "http://example.org/"

   sparql_endpoint:
      image: ghcr.io/oxigraph/oxigraph:v0.3.22
      volumes:
         - ./oxigraph/data:/data
      ports:
         - "7878:7878"

Data uploaded to Oxigraph:

@prefix ex: <http://example.org/> . 
ex:subject ex:predicate ex:object .

And when I access localhost:8080, without even trying to retrieve ex:subject, the Trifid container dies with the following message:

linked_data_server_1  | WARNING: 'SPARQL_PROXY_CACHE_URL' environment variable is not set
linked_data_server_1  | [14:25:15.920] INFO (trifid-core/6): Trifid instance listening on: http://0.0.0.0:8080/
linked_data_server_1  | 172.26.0.1 - - [31/Jan/2024:14:25:27 +0000] "GET / HTTP/1.1" 304 - "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:122.0) Gecko/20100101 Firefox/122.0"
linked_data_server_1  | /app/node_modules/sparql-http-client/lib/checkResponse.js:7
linked_data_server_1  |   const err = new Error(`${res.statusText} (${res.status}): ${message}`)
linked_data_server_1  |               ^
linked_data_server_1  | 
linked_data_server_1  | Error: Not Found (404): <!doctype html>
linked_data_server_1  | <html>
linked_data_server_1  | <head>
linked_data_server_1  |     <title>Example Domain</title>
linked_data_server_1  | 
linked_data_server_1  |     <meta charset="utf-8" />
linked_data_server_1  |     <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
linked_data_server_1  |     <meta name="viewport" content="width=device-width, initial-scale=1" />
linked_data_server_1  |     <style type="text/css">
linked_data_server_1  |     body {
linked_data_server_1  |         background-color: #f0f0f2;
linked_data_server_1  |         margin: 0;
linked_data_server_1  |         padding: 0;
linked_data_server_1  |         font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
linked_data_server_1  |         
linked_data_server_1  |     }
linked_data_server_1  |     div {
linked_data_server_1  |         width: 600px;
linked_data_server_1  |         margin: 5em auto;
linked_data_server_1  |         padding: 2em;
linked_data_server_1  |         background-color: #fdfdff;
linked_data_server_1  |         border-radius: 0.5em;
linked_data_server_1  |         box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);
linked_data_server_1  |     }
linked_data_server_1  |     a:link, a:visited {
linked_data_server_1  |         color: #38488f;
linked_data_server_1  |         text-decoration: none;
linked_data_server_1  |     }
linked_data_server_1  |     @media (max-width: 700px) {
linked_data_server_1  |         div {
linked_data_server_1  |             margin: 0 auto;
linked_data_server_1  |             width: auto;
linked_data_server_1  |         }
linked_data_server_1  |     }
linked_data_server_1  |     </style>    
linked_data_server_1  | </head>
linked_data_server_1  | 
linked_data_server_1  | <body>
linked_data_server_1  | <div>
linked_data_server_1  |     <h1>Example Domain</h1>
linked_data_server_1  |     <p>This domain is for use in illustrative examples in documents. You may use this
linked_data_server_1  |     domain in literature without prior coordination or asking for permission.</p>
linked_data_server_1  |     <p><a href="https://www.iana.org/domains/example">More information...</a></p>
linked_data_server_1  | </div>
linked_data_server_1  | </body>
linked_data_server_1  | </html>
linked_data_server_1  | 
linked_data_server_1  |     at checkResponse (/app/node_modules/sparql-http-client/lib/checkResponse.js:7:15)
linked_data_server_1  |     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
linked_data_server_1  |     at async ParsingQuery.select (/app/node_modules/sparql-http-client/StreamQuery.js:73:5)
linked_data_server_1  |     at async ParsingQuery.select (/app/node_modules/sparql-http-client/ParsingQuery.js:41:20)
linked_data_server_1  |     at async HttpInRDFHandler.queryRedirect (file:///app/node_modules/@zazuko/trifid-handle-redirects/index.js:95:22)
linked_data_server_1  |     at async HttpInRDFHandler.get (file:///app/node_modules/@zazuko/trifid-handle-redirects/index.js:117:22) {
linked_data_server_1  |   status: 404
linked_data_server_1  | }
linked_data_server_1  | 
linked_data_server_1  | Node.js v20.10.0
basquecountryinstitutionstransparentrelationsgraph_linked_data_server_1 exited with code 1

Thank you

Regards

ktk commented 7 months ago

We are close to a first release now, let's test that as soon as it is out.

mikel-egana-aranguren commented 7 months ago

Sure, please do let me know in this thread or on LinkedIn.

Thanks

ktk commented 6 months ago

@mikel-egana-aranguren you might want to try V5! https://github.com/zazuko/trifid/releases/tag/trifid%405.0.0

mikel-egana-aranguren commented 6 months ago

Sure, I will give it a try and let you know how it goes

ludovicm67 commented 6 months ago

Hi @mikel-egana-aranguren!

First, thanks a lot for reporting the issue. Sorry that it took me quite some time to release this new version, but there were many things that we needed to fix and test first.

Version 5 is now a thing, and your issue should be fixed now 🎉

Here is a scenario that I used to check that everything is working as expected, by using your examples.

Create a docker-compose.yaml file with the following content:

version: "3"

services:
  trifid:
    image: ghcr.io/zazuko/trifid:v5.0.1
    ports:
      - "8080:8080"
    environment:
      SPARQL_ENDPOINT_URL: "http://sparql_endpoint:7878/query"
      DATASET_BASE_URL: "http://example.org/"

  sparql_endpoint:
    image: ghcr.io/oxigraph/oxigraph:0.4.0-alpha.4
    volumes:
      - ./oxigraph/data:/data
    ports:
      - "7878:7878"

Then start the stack using:

docker compose up -d

Then, run the following command to add a basic triple:

curl -X POST http://localhost:7878/update \
-H "Content-Type: application/sparql-update" \
--data-binary 'PREFIX ex: <http://example.org/>
               INSERT DATA {
                 ex:subject ex:predicate ex:object .
               }'

This will insert a triple into the triplestore.
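To confirm that the triple actually landed in the store before involving Trifid, you could send an ASK query to the same Oxigraph instance. This is a sketch under the assumption that the stack above is running and that port 7878 is published on localhost as in the compose file; `check_triple` is a hypothetical helper name.

```shell
# Assumes the Oxigraph container from the compose file is reachable here.
OXIGRAPH_QUERY_URL="http://localhost:7878/query"

# ASK whether the triple inserted above is present in the store.
ASK_QUERY='PREFIX ex: <http://example.org/>
ASK { ex:subject ex:predicate ex:object }'

# Hypothetical helper: POST the query using the SPARQL 1.1 Protocol
# and request the result as SPARQL JSON.
check_triple() {
  curl -s -X POST "$OXIGRAPH_QUERY_URL" \
    -H "Content-Type: application/sparql-query" \
    -H "Accept: application/sparql-results+json" \
    --data-binary "$ASK_QUERY"
}

# Run once the stack is up:
# check_triple
```

The response should be a small JSON document whose `boolean` field is `true` if the insert succeeded.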

Now open your browser at http://0.0.0.0:8080/subject. You should see the triple you just inserted, dereferenced from the triplestore and with its URIs rewritten.

In YASGUI, you will see that it targets the /query endpoint. That endpoint does not rewrite results, so that they match the data as it is stored.

If you replace it with /query?rewrite=true, you will see the rewritten triple.
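To make the effect of rewriting concrete, here is a conceptual sketch in POSIX shell of what a prefix rewrite does: IRIs that start with the dataset base URL are mapped onto the URL the server is reached at. This is only an illustration and not Trifid's implementation; `rewrite_iri` is a hypothetical helper, and http://localhost:8080/ is assumed from the port mapping in the compose file above.

```shell
# Hypothetical helper, not part of Trifid: replace one IRI prefix with another.
# $1 = IRI, $2 = prefix to replace (dataset base URL), $3 = replacement prefix.
# Note: POSIX sh has no local variables, so iri/from/to/rest are global.
rewrite_iri() {
  iri=$1
  from=$2
  to=$3
  case "$iri" in
    "$from"*)
      # Strip the old prefix and prepend the new one.
      rest=${iri#"$from"}
      printf '%s\n' "$to$rest"
      ;;
    *)
      # IRIs outside the dataset base URL are left untouched.
      printf '%s\n' "$iri"
      ;;
  esac
}

rewrite_iri "http://example.org/subject" "http://example.org/" "http://localhost:8080/"
# prints: http://localhost:8080/subject
```

An IRI that does not start with the dataset base URL, such as http://schema.org/knows, passes through unchanged.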

You can stop the stack by using:

docker compose stop
# or docker compose down

If there is any issue, just let us know :)

mikel-egana-aranguren commented 5 months ago

Hi @ludovicm67

It works perfectly. Thanks for the detailed response, nice!

Regards

ludovicm67 commented 5 months ago

Hi @mikel-egana-aranguren

Thank you for your feedback, and I'm glad to see that everything is working as expected for you now :) I will close this issue.

Best, Ludovic.