beeldengeluid / comunica-web-client

A Web-based client to query the Web of Linked Data, using SPARQL or GraphQL-LD, powered by Comunica.
https://comunica-web-client.vercel.app/

Improved performance of queries and cleaned them up #28

Closed mwigham closed 1 year ago

mwigham commented 1 year ago

Test plan:

Improvements made are listed below:

Also added some additional queries

Fixes: https://github.com/beeldengeluid/comunica-web-client/issues/27

vercel[bot] commented 1 year ago

The latest updates on your projects.

| Name | Status | Preview | Updated |
| --- | --- | --- | --- |
| comunica-web-client | ✅ Ready (Inspect) | Visit Preview | Nov 1, 2022 at 6:21AM (UTC) |

mwigham commented 1 year ago

Thank you very much for the review. I've checked in new versions according to the following:

wmelder commented 1 year ago

For the MOZ Concerts query: Why not filter on the series ID? Also a property path seems to be useful here.

PREFIX sdo: <https://schema.org/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Items from the Muziekopnamen Zendgemachtigden collection
SELECT DISTINCT ?program ?programName
WHERE {
  VALUES ?series {
    <http://data.beeldengeluid.nl/id/series/2101608030025711131>
  }
  ?program sdo:name ?programName ;
           sdo:partOfSeason/sdo:partOfSeries ?series
} LIMIT 100

wmelder commented 1 year ago

All of the queries now carry this comment:

# Toon alleen items die ook op GPP te bekijken/beluisteren zijn.

("Show only items that can also be viewed or listened to on GPP.") This is a copy/paste mistake.

wmelder commented 1 year ago

For "concerts linked to Beethoven" I would suggest using the GTAA URI instead of filter( contains("beethoven")). It may be less clear, because the URI doesn't reveal the prefLabel right away, but we should encourage users to use the GTAA URIs instead of doing string matches.

    # use the GTAA URI for "Ludwig van Beethoven"
    VALUES ?entity_uri {
        <http://data.beeldengeluid.nl/gtaa/80808>
    }

This way you automatically exclude all the other Beethovens. There are 5 other matches when you use the query to search the thesaurus.
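
Since the URI alone does not show which label it stands for, a small lookup can bridge the two. The sketch below is an illustration only: it assumes the GTAA thesaurus exposes its labels through skos:prefLabel, which the mention of the prefLabel suggests but this thread does not confirm. The first query lists every concept whose label contains "beethoven", so the right URI can be picked once:

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# Assumption: thesaurus labels are stored as skos:prefLabel
SELECT ?entity_uri ?label
WHERE {
  ?entity_uri skos:prefLabel ?label .
  # case-insensitive substring match on the label text
  FILTER(CONTAINS(LCASE(STR(?label)), "beethoven"))
}
```

The second resolves a chosen URI back to its label, which can then go into the comment next to the VALUES block:

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# Assumption: thesaurus labels are stored as skos:prefLabel
SELECT ?label
WHERE {
  VALUES ?entity_uri {
    <http://data.beeldengeluid.nl/gtaa/80808>
  }
  ?entity_uri skos:prefLabel ?label .
}
```

The string match only needs to run once, when choosing the concept; the example queries themselves can then bind the URI directly.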

@mwigham This query for items linked to Beethoven is very fast:

PREFIX sdo: <https://schema.org/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Select items linked to Ludwig van Beethoven
SELECT DISTINCT ?program_uri ?programName
WHERE {
  VALUES ?series {
    <http://data.beeldengeluid.nl/id/series/2101608030025711131>
  }
  # use the GTAA URI for "Ludwig van Beethoven"
  VALUES ?entity_uri {
    <http://data.beeldengeluid.nl/gtaa/80808>
  }
  ?program_uri sdo:name ?programName .
  ?program_uri sdo:partOfSeason ?season .
  ?season sdo:partOfSeries ?series .
  {
    ?program_uri a sdo:CreativeWork ;
      (sdo:about|sdo:mentions|sdo:creator|sdo:contributor|sdo:actor|sdo:crew|sdo:performer)/
      (sdo:about|sdo:mentions|sdo:creator|sdo:contributor|sdo:actor|sdo:crew|sdo:performer) ?entity_uri
  }
  UNION
  {
    ?program_uri sdo:isPartOfSeason ?season
    FILTER EXISTS {
      ?season (sdo:about|sdo:mentions|sdo:creator|sdo:contributor|sdo:actor|sdo:crew|sdo:performer)/
              (sdo:about|sdo:mentions|sdo:creator|sdo:contributor|sdo:actor|sdo:crew|sdo:performer) ?entity_uri
    }
  }
  UNION
  {
    ?program_uri sdo:isPartOfSeason/sdo:isPartOfSeries ?series
    FILTER EXISTS {
      ?series (sdo:about|sdo:mentions|sdo:creator|sdo:contributor|sdo:actor|sdo:crew|sdo:performer)/
              (sdo:about|sdo:mentions|sdo:creator|sdo:contributor|sdo:actor|sdo:crew|sdo:performer) ?entity_uri
    }
  }
  UNION
  {
    ?scene sdo:hasPart ?program_uri .
    FILTER EXISTS {
      ?scene (sdo:about|sdo:mentions|sdo:creator|sdo:contributor|sdo:actor|sdo:crew|sdo:performer)/
             (sdo:about|sdo:mentions|sdo:creator|sdo:contributor|sdo:actor|sdo:crew|sdo:performer) ?entity_uri
    }
  }
} LIMIT 100

mwigham commented 1 year ago

Thanks Willem.

My reason for matching on the names of the series and of Beethoven was readability, as these are example queries. We can change that to the IDs: I think you are right that it is more important to teach users fast queries than to make the queries more readable. In that case we won't need the property path.

I don't see the GPP comment in any of the MOZ queries in this branch. Did you perhaps look at them online in the Comunica client itself? That won't be updated until this PR is merged.

mwigham commented 1 year ago

@wmelder I've checked in the changes, please take a look.

Re-reading some of your comments above, I think you may have reviewed old versions of the queries. Please read the files in the latest commit of this PR.

wmelder commented 1 year ago

> I don't see the GPP comment in any of the MOZ queries in this branch. Did you perhaps look at them online in the Comunica client itself? That won't be updated until this PR is merged.

Yes, indeed I was foolishly just looking at the online client and not in the PR.

wmelder commented 1 year ago

@mwigham Queries are looking good!