alphagov / govuk-knowledge-graph-gcp

GOV.UK content data and cloud infrastructure for the GovSearch app.
https://docs.data-community.publishing.service.gov.uk/tools/govgraph/
MIT License

fix: extract step-by-step content correctly #655

Closed: nacnudus closed this 4 months ago

nacnudus commented 4 months ago

It turns out that not every step-by-step page has its content rendered to HTML in the details.body field. This commit introduces a JavaScript UDF in BigQuery to extract the content as govspeak.
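For illustration only, here is a minimal sketch of what the body of such a UDF might look like. It is not the code from this commit. The function name `extractStepByStepContent` is hypothetical, and the field names (`step_by_step_nav`, `steps`, `contents`, `type`, `text`, `href`) are assumptions about the shape of the `details` JSON; check them against the actual schema before relying on this.

```javascript
// Hypothetical sketch, not the UDF from this PR.
// Assumed input: the `details` JSON of a step-by-step page as a string, with
// a `step_by_step_nav.steps` array whose items hold a `title` and a
// `contents` array of paragraphs and lists (field names are assumptions).
function extractStepByStepContent(detailsJson) {
  var details = JSON.parse(detailsJson);
  var nav = details.step_by_step_nav;
  if (!nav || !Array.isArray(nav.steps)) {
    return null; // nothing to extract from this page
  }
  var lines = [];
  nav.steps.forEach(function (step) {
    if (step.title) {
      lines.push("## " + step.title); // step title as a markdown-style heading
    }
    (step.contents || []).forEach(function (item) {
      if (item.type === "paragraph" && item.text) {
        lines.push(item.text);
      } else if (item.type === "list") {
        (item.contents || []).forEach(function (entry) {
          // Linked entries become markdown links, plain ones stay as text.
          lines.push(
            entry.href
              ? "- [" + entry.text + "](" + entry.href + ")"
              : "- " + entry.text
          );
        });
      }
    });
  });
  return lines.join("\n");
}
```

In BigQuery, a function body like this would typically be wrapped in a `CREATE FUNCTION ... RETURNS STRING LANGUAGE js AS r"""...""";` statement and called on the JSON string of each step-by-step page. Pages without the assumed structure return `NULL` here rather than raising an error.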

world_location_news is now also included.