comunica / comunica-feature-link-traversal

📬 Comunica packages for link traversal-based query execution

ldp:contains links not being followed #121

Closed smessie closed 6 months ago

smessie commented 8 months ago

Issue type:


Description:

Link traversal is not performed. It only requests the link that is provided in the context's sources, even though it should follow other links as that document contains ldp:contains triples.

The code in my Web client app:

import {QueryEngine} from "@comunica/query-sparql-link-traversal-solid";

const engine = new QueryEngine();

const query = `
    PREFIX ldes: <https://w3id.org/ldes#>
    PREFIX tree: <https://w3id.org/tree#>
    PREFIX ldp: <http://www.w3.org/ns/ldp#>
    PREFIX dc: <http://purl.org/dc/terms/>

    SELECT ?member ?dateTime
    WHERE {
      <${ldesUrl}> a ldes:EventStream;
                   tree:view ?view.
      ?view a tree:Node;
            tree:relation ?relation.
      ?relation a ?relationType;
                tree:node ?node.
      ?node a ldp:BasicContainer;
            ldp:contains ?member.
      ?member a ldp:Resource;
              dc:modified ?dateTime.
      FILTER (?node = <${fragmentUri}>).
    }`;
const bindings = await engine.queryBindings(query, {sources: [ldesUrl], fetch: customFetch, lenient: false});

This yields the following query after filling in the JavaScript variables (printed via console.log to make sure that the query is what I think it is):

    PREFIX ldes: <https://w3id.org/ldes#>
    PREFIX tree: <https://w3id.org/tree#>
    PREFIX ldp: <http://www.w3.org/ns/ldp#>
    PREFIX dc: <http://purl.org/dc/terms/>

    SELECT ?member ?dateTime
    WHERE {
      <http://localhost:3000/researcher-test/ldesinldp/#EventStream> a ldes:EventStream;
                   tree:view ?view.
      ?view a tree:Node;
            tree:relation ?relation.
      ?relation a ?relationType;
                tree:node ?node.
      ?node a ldp:BasicContainer;
            ldp:contains ?member.
      ?member a ldp:Resource;
              dc:modified ?dateTime.
      FILTER (?node = <http://localhost:3000/researcher-test/ldesinldp/1693558885201/>).
    }

The content of http://localhost:3000/researcher-test/ldesinldp/#EventStream:

<> a <https://w3id.org/tree#Node>, ldp:Container, ldp:BasicContainer, ldp:Resource;
    <https://w3id.org/tree#relation> _:b8_b61_b58_b55_b52_b49_b46_b43_b40_b37_b34_b26_b21_e_b00;
    <https://w3id.org/tree#viewDescription> <#ViewDescription>;
    ldp:inbox <1693558885201/>;
    dc:modified "2023-09-08T12:28:12.000Z"^^xsd:dateTime.
_:b8_b61_b58_b55_b52_b49_b46_b43_b40_b37_b34_b26_b21_e_b00 a <https://w3id.org/tree#GreaterThanOrEqualToRelation>;
    <https://w3id.org/tree#path> dc:created;
    <https://w3id.org/tree#value> "2023-09-01T09:01:25.201Z"^^xsd:dateTime;
    <https://w3id.org/tree#node> <1693558885201/>.
<#ViewDescription> a <https://w3id.org/tree#ViewDescription>;
    <http://www.w3.org/ns/dcat#servesDataset> <#EventStream>;
    <http://www.w3.org/ns/dcat#endpointURL> <>;
    <https://w3id.org/ldes#managedBy> <#LDESinLDPClient>.
<1693558885201/> a ldp:Container, ldp:BasicContainer, ldp:Resource;
    dc:modified "2023-11-09T08:15:43.000Z"^^xsd:dateTime.
<#EventStream> a <https://w3id.org/ldes#EventStream>;
    <https://w3id.org/tree#view> <>.
<#LDESinLDPClient> a <https://w3id.org/ldes#LDESinLDPClient>;
    <https://w3id.org/ldes#bucketizeStrategy> <#BucketizeStrategy>.
<#BucketizeStrategy> a <https://w3id.org/ldes#BucketizeStrategy>;
    <https://w3id.org/tree#path> dc:created;
    <https://w3id.org/ldes#bucketType> <https://w3id.org/ldes#timestampFragmentation>.
<> posix:mtime 1694176092;
    ldp:contains <1693558885201/>.
<1693558885201/> posix:mtime 1699517743.

Content of http://localhost:3000/researcher-test/ldesinldp/1693558885201/:

@prefix dc: <http://purl.org/dc/terms/>.
@prefix ldp: <http://www.w3.org/ns/ldp#>.
@prefix posix: <http://www.w3.org/ns/posix/stat#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.

<> a ldp:Container, ldp:BasicContainer, ldp:Resource;
    dc:modified "2023-11-09T08:15:43.000Z"^^xsd:dateTime.
<7a6f00f1-b72e-4202-bff4-0365de5431f9> a ldp:Resource, <http://www.w3.org/ns/iana/media-types/application/ld+json#Resource>;
    dc:modified "2023-09-05T10:55:45.000Z"^^xsd:dateTime.
<dbd5725e-413f-4213-b8bc-c2e77d8b0445> a ldp:Resource, <http://www.w3.org/ns/iana/media-types/application/ld+json#Resource>;
    dc:modified "2023-09-05T10:55:45.000Z"^^xsd:dateTime.
<mattermost-example> a ldp:Resource, <http://www.w3.org/ns/iana/media-types/application/ld+json#Resource>;
    dc:modified "2023-10-31T14:31:18.000Z"^^xsd:dateTime.
<spec-example> a ldp:Resource, <http://www.w3.org/ns/iana/media-types/application/ld+json#Resource>;
    dc:modified "2023-10-31T14:29:14.000Z"^^xsd:dateTime.
<18252ab4-e48b-4830-8843-e68d4d5084f5> a ldp:Resource, <http://www.w3.org/ns/iana/media-types/application/ld+json#Resource>;
    dc:modified "2023-10-30T09:24:20.000Z"^^xsd:dateTime.
<fd2144a3-b4ec-4408-aa61-61101b9f88c0> a ldp:Resource, <http://www.w3.org/ns/iana/media-types/application/ld+json#Resource>;
    dc:modified "2023-09-05T10:55:45.000Z"^^xsd:dateTime.
<unknown-type> a ldp:Resource, <http://www.w3.org/ns/iana/media-types/application/ld+json#Resource>;
    dc:modified "2023-11-09T08:15:43.000Z"^^xsd:dateTime.
<6d98e184-d5b6-4ab4-b5d2-2d467711db0b> a ldp:Resource, <http://www.w3.org/ns/iana/media-types/application/ld+json#Resource>;
    dc:modified "2023-10-30T09:24:20.000Z"^^xsd:dateTime.
<ee255622-352d-400b-a45d-d17e95a9bcd9> a ldp:Resource, <http://www.w3.org/ns/iana/media-types/application/ld+json#Resource>;
    dc:modified "2023-09-05T10:55:45.000Z"^^xsd:dateTime.
<ae92b258-4a5a-4066-b97f-e64353e9574c> a ldp:Resource, <http://www.w3.org/ns/iana/media-types/application/ld+json#Resource>;
    dc:modified "2023-09-05T10:55:45.000Z"^^xsd:dateTime.
<0a61ee82-d11c-40a4-b7c0-e72ec9aa6a6c> a ldp:Resource, <http://www.w3.org/ns/iana/media-types/application/ld+json#Resource>;
    dc:modified "2023-09-05T10:55:45.000Z"^^xsd:dateTime.
<2e5ad3e8-c8ce-4cb5-9255-02c88b4953ae> a ldp:Resource, <http://www.w3.org/ns/iana/media-types/application/ld+json#Resource>;
    dc:modified "2023-09-05T10:55:45.000Z"^^xsd:dateTime.
<optionals> a ldp:Resource, <http://www.w3.org/ns/iana/media-types/application/ld+json#Resource>;
    dc:modified "2023-11-08T09:11:43.000Z"^^xsd:dateTime.
<b9527304-f1bc-435f-b488-bd441db63806> a ldp:Resource, <http://www.w3.org/ns/iana/media-types/application/ld+json#Resource>;
    dc:modified "2023-10-06T12:42:04.000Z"^^xsd:dateTime.
<> posix:mtime 1699517743;
    ldp:contains <7a6f00f1-b72e-4202-bff4-0365de5431f9>, <dbd5725e-413f-4213-b8bc-c2e77d8b0445>, <mattermost-example>, <spec-example>, <18252ab4-e48b-4830-8843-e68d4d5084f5>, <fd2144a3-b4ec-4408-aa61-61101b9f88c0>, <unknown-type>, <6d98e184-d5b6-4ab4-b5d2-2d467711db0b>, <ee255622-352d-400b-a45d-d17e95a9bcd9>, <ae92b258-4a5a-4066-b97f-e64353e9574c>, <0a61ee82-d11c-40a4-b7c0-e72ec9aa6a6c>, <2e5ad3e8-c8ce-4cb5-9255-02c88b4953ae>, <optionals>, <b9527304-f1bc-435f-b488-bd441db63806>.
<7a6f00f1-b72e-4202-bff4-0365de5431f9> posix:mtime 1693911345;
    posix:size 933.
<dbd5725e-413f-4213-b8bc-c2e77d8b0445> posix:mtime 1693911345;
    posix:size 935.
<mattermost-example> posix:mtime 1698762678;
    posix:size 900.
<spec-example> posix:mtime 1698762554;
    posix:size 1140.
<18252ab4-e48b-4830-8843-e68d4d5084f5> posix:mtime 1698657860;
    posix:size 1009.
<fd2144a3-b4ec-4408-aa61-61101b9f88c0> posix:mtime 1693911345;
    posix:size 935.
<unknown-type> posix:mtime 1699517743;
    posix:size 1290.
<6d98e184-d5b6-4ab4-b5d2-2d467711db0b> posix:mtime 1698657860;
    posix:size 1219.
<ee255622-352d-400b-a45d-d17e95a9bcd9> posix:mtime 1693911345;
    posix:size 935.
<ae92b258-4a5a-4066-b97f-e64353e9574c> posix:mtime 1693911345;
    posix:size 935.
<0a61ee82-d11c-40a4-b7c0-e72ec9aa6a6c> posix:mtime 1693911345;
    posix:size 933.
<2e5ad3e8-c8ce-4cb5-9255-02c88b4953ae> posix:mtime 1693911345;
    posix:size 935.
<optionals> posix:mtime 1699434703;
    posix:size 817.
<b9527304-f1bc-435f-b488-bd441db63806> posix:mtime 1696596124;
    posix:size 936.

In the network tab of my browser I can see that only http://localhost:3000/researcher-test/ldesinldp/ is requested. However, when I manually add fragmentUri (= http://localhost:3000/researcher-test/ldesinldp/1693558885201/) to the context's sources, the query executes as it is supposed to and finds all expected results. This should prove that my query is correct. I would, however, expect link traversal to find its way to this (fragmentUri) resource by following either ldp:contains or the link in the query.
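For reference, the manual workaround described above could be sketched as follows. `buildContext` is a hypothetical helper introduced only for illustration (it is not part of Comunica); it lists the fragment as an extra seed source next to the LDES URL, so the engine no longer depends on traversal to reach it.

```javascript
// Hypothetical helper (not a Comunica API): builds the query context with
// the fragment URI added as a second seed source.
function buildContext(ldesUrl, fragmentUri, customFetch) {
  return {
    sources: [ldesUrl, fragmentUri], // fragment is requested directly
    fetch: customFetch,
    lenient: false,
  };
}

// Usage in the app, with engine and query as defined earlier:
// const bindings = await engine.queryBindings(
//   query, buildContext(ldesUrl, fragmentUri, customFetch));
```

This only sidesteps the problem: it confirms that the query itself is correct, but it does not exercise link traversal.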


Environment:

"@comunica/query-sparql-link-traversal-solid": "^0.2.0" (package-lock.json and node_modules removed and installed again)

Running in Web client environment (Vue with Vite)

Crash log:

rubensworks commented 8 months ago

Don't see anything wrong there at first glance. This should be a good starting point for debugging: https://github.com/comunica/comunica-feature-link-traversal/blob/master/packages/actor-extract-links-predicates/lib/ActorExtractLinksPredicates.ts#L25

smessie commented 7 months ago

I'm really lost. I first tried debugging by npm linking comunica in the project, but in the end I started debugging using the command line version of the locally cloned comunica LTQP repo. However, that gave correct results, as did the online browser version.

I continued debugging and made JavaScript versions in plain Node, with Vite and with Vue. They can be found here: https://github.com/smessie/comunica-ltqp-debugging

With just Node, I get the correct results as expected. With Vite and Vue, however, the query returns no results and does not follow any links (as you can see in the network tab of your browser).

Any ideas on the cause for this?

smessie commented 7 months ago

After some more hours of debugging I found a workaround by accident while getting console.log in Comunica to work (it also made Comunica Link Traversal work 😄)

Configuring the NodeModulesPolyfillPlugin in vite.config.js resolved yet another incompatibility that comes from porting the Node.js-targeted Comunica to the browser.

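import { defineConfig } from 'vite';
import { NodeModulesPolyfillPlugin } from '@esbuild-plugins/node-modules-polyfill';

export default defineConfig({
    optimizeDeps: {
        esbuildOptions: {
            // Polyfill Node.js built-ins (such as util) while pre-bundling dependencies
            plugins: [
                NodeModulesPolyfillPlugin(),
            ],
        },
    },
});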

See also https://github.com/smessie/comunica-ltqp-debugging/commit/e937234bbdd04bb301cbbaa70ed7ed59cf57afe2 to see how the MRE now works with this extra configuration.

The warnings that appeared in the browser console (which I unfortunately ignored at first) should explain more about the specific underlying issue.

browser-external:util:9 Module "util" has been externalized for browser compatibility. Cannot access "util.debuglog" in client code. See http://vitejs.dev/guide/troubleshooting.html#module-externalized-for-browser-compatibility for more details.
get @   browser-external:util:9
node_modules/readable-web-to-node-stream/node_modules/readable-stream/lib/_stream_readable.js   @   _stream_readable.js:55
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
node_modules/readable-web-to-node-stream/node_modules/readable-stream/readable-browser.js   @   readable-browser.js:1
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
node_modules/readable-web-to-node-stream/lib/index.js   @   index.js:4
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
node_modules/@comunica/bus-http/lib/ActorHttp.js    @   ActorHttp.ts:3
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
node_modules/@comunica/bus-http/lib/index.js    @   index.ts:1
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
node_modules/@comunica/actor-http-fetch/lib/ActorHttpFetch.js   @   ActorHttpFetch.ts:2
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
node_modules/@comunica/actor-http-fetch/lib/index.js    @   index.ts:1
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
node_modules/@comunica/query-sparql-link-traversal-solid/engine-default.js  @   engine-default.js:313
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
node_modules/@comunica/query-sparql-link-traversal-solid/lib/QueryEngine.js @   QueryEngine.ts:4
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
node_modules/@comunica/query-sparql-link-traversal-solid/lib/index-browser.js   @   index-browser.ts:3
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
(anonymous)

and

Module "util" has been externalized for browser compatibility. Cannot access "util.inspect" in client code. See http://vitejs.dev/guide/troubleshooting.html#module-externalized-for-browser-compatibility for more details.
get @   browser-external:util:9
node_modules/readable-web-to-node-stream/node_modules/readable-stream/lib/internal/streams/buffer_list.js   @   buffer_list.js:14
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
node_modules/readable-web-to-node-stream/node_modules/readable-stream/lib/_stream_readable.js   @   _stream_readable.js:62
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
node_modules/readable-web-to-node-stream/node_modules/readable-stream/readable-browser.js   @   readable-browser.js:1
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
node_modules/readable-web-to-node-stream/lib/index.js   @   index.js:4
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
node_modules/@comunica/bus-http/lib/ActorHttp.js    @   ActorHttp.ts:3
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
node_modules/@comunica/bus-http/lib/index.js    @   index.ts:1
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
node_modules/@comunica/actor-http-fetch/lib/ActorHttpFetch.js   @   ActorHttpFetch.ts:2
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
node_modules/@comunica/actor-http-fetch/lib/index.js    @   index.ts:1
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
node_modules/@comunica/query-sparql-link-traversal-solid/engine-default.js  @   engine-default.js:313
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
node_modules/@comunica/query-sparql-link-traversal-solid/lib/QueryEngine.js @   QueryEngine.ts:4
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
node_modules/@comunica/query-sparql-link-traversal-solid/lib/index-browser.js   @   index-browser.ts:3
__require2  @   chunk-7REXU52E.js?v=c085e4a3:19
(anonymous)

surilindur commented 7 months ago

It looks like it could be related to readable-web-to-node-stream using readable-stream version 3.x and not 4.x, and the 3.x versions seem to be using those functions from util without providing their own polyfills.

No idea how to fix that, since there is no newer version available of that readable-web-to-node-stream package that would use a newer readable-stream.

rubensworks commented 7 months ago

We seem to be using that package in our base http bus here: https://github.com/search?q=repo%3Acomunica%2Fcomunica%20readable-web-to-node-stream&type=code. The package doesn't seem to be maintained anymore. But since it's licensed under MIT, we can probably fork it and depend on the latest readable-stream version. Is this something one of you wants to look into, @smessie or @surilindur? (probably only after the ESWC deadlines 😉)

smessie commented 5 months ago

After updating the dependencies, I've realised that the update to @smessie/readable-web-to-node-stream@^3.0.3 works. However, the base http bus is not the only user of that package: the fetch-sparql-endpoint.js library, which some actors use, depends on it as well. Those actors will also have to update to the newest fetch-sparql-endpoint version once https://github.com/rubensworks/fetch-sparql-endpoint.js/pull/72 gets merged.

Edit: well, actually that last step should not be needed as doing a clean npm/yarn install should update to the newest version of the fetch-sparql-endpoint dependency then.

smessie commented 5 months ago

Another day, another update.

With all dependencies updated to their latest versions, I still need to polyfill the global variable before LTQP is working. I've updated my example app to deliver a reproducible example: https://github.com/smessie/comunica-ltqp-debugging/tree/main/vue

With global being polyfilled: The 4 expected results are being returned by following the ldp:contains links in the initial resource.

Without global being polyfilled: Zero results are being returned, and you can notice that only the initial resource is being retrieved and no other links are being followed.

To try it out yourself, you can use the Vue application in my comunica-ltqp-debugging repository and comment out (no results) or uncomment (the 4 expected results) the global polyfill.

https://github.com/smessie/comunica-ltqp-debugging/blob/a4a111590d30d8835648e843c75f425233bf8974/vue/vite.config.js#L13
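A common way to polyfill global in a Vite project is the `define` option, sketched below. This is an assumption about what such a polyfill can look like, not a copy of the linked line; the actual configuration in the repository may differ.

```javascript
// vite.config.js (minimal sketch, assuming the `define` approach)
import { defineConfig } from 'vite';

export default defineConfig({
  define: {
    // Replace the Node.js-style `global` identifier with the standard
    // `globalThis`, which exists in browsers.
    global: 'globalThis',
  },
});
```

Commenting out the `global` entry would correspond to the "no results" case described above.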

The problem, however, is that previously there was an error or warning pointing out where a non-polyfilled variable was being used, whereas now there are no errors in the console. That makes it hard to find the root cause of the issue.

rubensworks commented 5 months ago

AFAIK, we use globalThis everywhere. Maybe one of our deps is using global instead of globalThis?

smessie commented 5 months ago

Maybe yes, but I can't seem to find any.
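One possible way to track down such a dependency is a temporary tracing getter, sketched below. `traceGlobalAccess` is a hypothetical debugging helper, not part of Comunica or Vite. The idea is to use it in place of the plain polyfill: it defines `global` as a property that logs a stack trace on every read, so the call site shows up in the console. It may also explain the missing errors, assuming the dependency guards its access with something like `typeof global !== 'undefined'`, which fails silently when global is absent.

```javascript
// Hypothetical debugging helper: defines `global` on the target object as a
// getter that logs a stack trace whenever it is read, then returns the target.
function traceGlobalAccess(target = globalThis, log = console.warn) {
  Object.defineProperty(target, 'global', {
    configurable: true, // allows removing the tracer again afterwards
    get() {
      log(new Error('`global` was accessed').stack);
      return target;
    },
  });
}

// Usage: call traceGlobalAccess() before importing the query engine, run the
// query, and inspect the logged stack traces for files under node_modules.
```

Because bundlers may rewrite or hoist imports, the helper has to run before the engine module is evaluated for the traces to be complete.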