visualize-admin / visualization-tool

The tool for visualizing Swiss Open Government Data. Project ownership: Federal Office for the Environment FOEN
https://visualize.admin.ch
BSD 3-Clause "New" or "Revised" License

feat: Managed cached endpoint #1599

Closed bprusinowski closed 3 months ago

bprusinowski commented 3 months ago

Closes #1596

This PR:

vercel[bot] commented 3 months ago

The latest updates on your projects. Learn more about Vercel for Git ↗︎

| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| visualization-tool | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Jul 2, 2024 10:20am |

bprusinowski commented 3 months ago

cc @Rdataflow, waiting for the PROD endpoint to be fixed so we can properly test the change and implement any potential additional requirements (e.g. encoding the cube iri) 👀

ptbrowne commented 3 months ago

Maybe you could extract all possible variable types with some typescript-fu:

type ExtractResolversObject<O> = O extends ResolversObject<infer S> ? S : never
type B = ExtractResolversObject<QueryResolvers>
type ExtractResolver<O> = O extends Resolver<any, any, any, infer S> ? S : never
type Vars = ExtractResolver<B[keyof B]>

Then you can cast (`as`) the variableInfos from the GraphQL context to Vars?
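A minimal, self-contained sketch of the idea, under stated assumptions: `Resolver` below is a simplified stand-in for the generated type, `QueryResolvers` has two made-up fields, and `Context`, `getCubeIri`, and the shape of `variableInfos` are hypothetical names used only for illustration. The `ResolversObject` unwrapping step is skipped because the stand-in map is a plain object type.

```typescript
// Simplified stand-in for the generated resolver type (assumption: the real
// generated type is richer, but the argument type sits in the same slot).
type Resolver<TResult, TParent, TContext, TArgs> = (
  parent: TParent,
  args: TArgs,
  context: TContext
) => TResult | Promise<TResult>;

// Hypothetical resolver map; the field names and argument shapes are made up.
type QueryResolvers = {
  dataCubeMetadata: Resolver<string, unknown, unknown, { iri: string; locale: string }>;
  searchCubes: Resolver<string[], unknown, unknown, { query: string }>;
};

// Pull the argument type out of a single resolver. The conditional type
// distributes over unions, so a union of resolvers yields a union of args.
type ExtractResolver<O> = O extends Resolver<any, any, any, infer S> ? S : never;
type Vars = ExtractResolver<QueryResolvers[keyof QueryResolvers]>;

// Hypothetical context shape: variableInfos is untyped until it is cast.
type Context = { variableInfos: unknown };

function getCubeIri(ctx: Context): string | undefined {
  const vars = ctx.variableInfos as Vars;
  // The `in` check narrows the union to the member that carries `iri`.
  return "iri" in vars ? vars.iri : undefined;
}

const withIri = getCubeIri({
  variableInfos: { iri: "https://example.org/cube/1", locale: "en" },
});
const withoutIri = getCubeIri({ variableInfos: { query: "traffic" } });
```

The cast is unchecked at runtime, so this only helps with typing; it does not validate what the context actually contains.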

bprusinowski commented 3 months ago

Thanks for the idea @ptbrowne, I will check it out tomorrow 💯

ptbrowne commented 3 months ago

Ah, this time the end-to-end tests seem to have found something; I do not recognize the usual suspects among the failed E2E tests.

bprusinowski commented 3 months ago

Yes, I think it might be related to the fact that with this PR, the PROD endpoint is broken, so any test relying on it won't work. I'll re-check once it's fixed by Zazuko 👀

Rdataflow commented 3 months ago

@bprusinowski can you add statistics on visit frequency to address case b)? (i.e. visits per chart key)

bprusinowski commented 3 months ago

@Rdataflow yes, that's the plan to try to add a new table to our database that would store information on views per chart config 👍

Rdataflow commented 3 months ago

> @Rdataflow yes, that's the plan to try to add a new table to our database that would store information on views per chart config 👍

@bprusinowski curious: is there already some PR around ?

bprusinowski commented 3 months ago

@Rdataflow yes, this was already implemented in https://github.com/visualize-admin/visualization-tool/pull/1613 😄

Rdataflow commented 3 months ago

@bprusinowski the new varnish config is now on PROD 👍 and unblocks this PR 🚀

bprusinowski commented 3 months ago

Thanks for the hint @Rdataflow! In this case I'll also take a look at pre-populating the cache for the most viewed charts and then merge the PR. Will do it tomorrow 👍

bprusinowski commented 3 months ago

@Rdataflow the changes should soon be on TEST :)

Rdataflow commented 3 months ago

@bprusinowski by inspecting the traffic I spotted 3 types of queries that should also be sent to `${endpoint}/${cubeIri}` (currently they are sent to the default endpoint).

The rationale is to harden the cached charts against a DB outage and to max out long-term caching 😄

- query 1:
```sparql
PREFIX cube: <https://cube.link/>
PREFIX schema: <http://schema.org/>

SELECT ?iri WHERE {
  {
    # Versioned cube.
    SELECT ?iri ?version WHERE {
      VALUES ?oldIri { <https://environment.ld.admin.ch/foen/ubd000503bis/2> }
      ?versionHistory schema:hasPart ?oldIri .
      ?versionHistory schema:hasPart ?iri .
      ?iri schema:version ?version .
      ?iri schema:creativeWorkStatus ?status .
      ?oldIri schema:creativeWorkStatus ?oldStatus .
      FILTER(NOT EXISTS { ?iri schema:expires ?expires . } && ?status IN (?oldStatus, <https://ld.admin.ch/vocabulary/CreativeWorkStatus/Published>))
    }
    ORDER BY DESC(?version)
  } UNION {
    {
      # Version history of a cube.
      SELECT ?iri ?status ?version WHERE {
        VALUES ?versionHistory { <https://environment.ld.admin.ch/foen/ubd000503bis/2> }
        ?versionHistory schema:hasPart ?iri .
        ?iri schema:version ?version .
        ?iri schema:creativeWorkStatus ?status .
        FILTER(NOT EXISTS { ?iri schema:expires ?expires . })
      }
      ORDER BY DESC(?status) DESC(?version)
    }
  } UNION {
    {
      # Non-versioned cube.
      SELECT ?iri ?status WHERE {
        VALUES ?iri { <https://environment.ld.admin.ch/foen/ubd000503bis/2> }
        ?iri cube:observationConstraint ?shape .
        ?iri schema:creativeWorkStatus ?status .
        FILTER(NOT EXISTS { ?iri schema:expires ?expires . } && NOT EXISTS { ?versionHistory schema:hasPart ?iri . })
      }
      ORDER BY DESC(?status)
    }
  }
}
LIMIT 1
```

- query 2: some dimension version query pattern
```sparql
PREFIX cube: <https://cube.link/>
PREFIX schema: <http://schema.org/>
PREFIX sh: <http://www.w3.org/ns/shacl#>

SELECT ?dimensionIri ?version ?nodeKind WHERE {
  <https://environment.ld.admin.ch/foen/ubd000503bis/2> cube:observationConstraint/sh:property ?dimension .
  ?dimension sh:path ?dimensionIri .
  OPTIONAL { ?dimension schema:version ?version . }
  OPTIONAL { ?dimension sh:nodeKind ?nodeKind . }
  FILTER(?dimensionIri IN (<https://environment.ld.admin.ch/foen/ubd000503bis/treibstoffe>))
}
```

- query 3:
```sparql
PREFIX cube: <https://cube.link/>
PREFIX schema: <http://schema.org/>

SELECT ?dimension0_v WHERE {
  <https://environment.ld.admin.ch/foen/ubd000503bis/2> cube:observationSet/cube:observation ?observation .
  ?observation <https://environment.ld.admin.ch/foen/ubd000503bis/treibstoffe> ?dimension0 .
  ?dimension0 schema:sameAs ?dimension0_v .
  VALUES ?dimension0_v { <https://environment.ld.admin.ch/foen/ubd000503bis/Treibstoffe/treib1> }
}
LIMIT 1
```
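
For illustration, a minimal sketch of how a per-cube endpoint URL of the form `${endpoint}/${cubeIri}` could be built. The helper name `buildCubeEndpoint`, the `example.org` endpoint, and the `encode` flag are hypothetical; whether the cube IRI has to be percent-encoded was still an open question earlier in this thread, so encoding is shown as an option rather than a requirement.

```typescript
// Hypothetical helper, not taken from the visualization-tool codebase.
function buildCubeEndpoint(
  endpoint: string,
  cubeIri: string,
  encode: boolean = true
): string {
  // Drop trailing slashes so the result never contains "//" before the IRI.
  const base = endpoint.replace(/\/+$/, "");
  // Assumption: the cache may expect the IRI as one percent-encoded path segment.
  const suffix = encode ? encodeURIComponent(cubeIri) : cubeIri;
  return `${base}/${suffix}`;
}

const cubeIri = "https://environment.ld.admin.ch/foen/ubd000503bis/2";
const encodedUrl = buildCubeEndpoint("https://example.org/query/", cubeIri);
const rawUrl = buildCubeEndpoint("https://example.org/query", cubeIri, false);
```

Sending all three query types above through the same helper would keep every request for a given cube under one cacheable URL prefix.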

bprusinowski commented 3 months ago

😱 thanks for spotting this @Rdataflow, I will investigate why this is the case 👍