alphagov / govuk-content-metadata

GovNER: an encoder-based language model (RoBERTa) fine-tuned to perform Named Entity Recognition (NER) on GOV.UK content
MIT License

Create named-entity URLs using standard URI encoding #67

Closed: exfalsoquodlibet closed this 1 year ago

exfalsoquodlibet commented 1 year ago

Updated the SQL script for post-extraction processing to create named-entity URLs using standard URI encoding, as advised by the developers.

Info: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent
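For illustration only, here is a minimal Python sketch of the encoding behaviour that `encodeURIComponent` describes. It is not the SQL script from this pull request. The base URL and the function names `encode_uri_component` and `named_entity_url` are hypothetical, and the real URL pattern used for named entities is not shown in this description.

```python
from urllib.parse import quote

# Characters that encodeURIComponent leaves unescaped, in addition to the
# letters, digits and "_.-~" that urllib.parse.quote never escapes.
_UNRESERVED_EXTRA = "!*'()"

# Hypothetical base URL, for illustration only; not taken from the repository.
BASE_URL = "https://example.org/named-entity/"


def encode_uri_component(text: str) -> str:
    """Percent-encode text as UTF-8, mirroring JavaScript's encodeURIComponent."""
    # Passing an explicit `safe` replaces the default safe="/", so "/" is escaped too.
    return quote(text, safe=_UNRESERVED_EXTRA)


def named_entity_url(entity_name: str) -> str:
    """Build a URL whose final path segment is the encoded entity name."""
    return BASE_URL + encode_uri_component(entity_name)


if __name__ == "__main__":
    # Illustrative inputs: spaces, "&", "/" and non-ASCII characters are all escaped.
    for name in ["HM Revenue & Customs", "Isle of Man/Jersey", "Café"]:
        print(named_entity_url(name))
```

Encoding the whole entity name as a single component means that characters such as `/`, `&` and `?` cannot be misread as URL structure, which is the usual reason to prefer component-level encoding over ad hoc character replacement.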

The relevant Scheduled Query has also been updated accordingly.

Some test cases can be seen here: https://console.cloud.google.com/bigquery?project=cpto-content-metadata&ws=!1m4!1m3!8m2!1s673804617052!2s95dafc4a00b646cea5711c91490452a1

