alphagov / govuk-content-metadata

GovNER: an encoder-based language model (RoBERTa) fine-tuned to perform Named Entity Recognition (NER) on GOV.UK content
MIT License

Use different bq table as input to bulk_inference_pipeline #78

Closed exfalsoquodlibet closed 1 year ago

exfalsoquodlibet commented 1 year ago

Updated bulk_inference_pipeline/src/sql_queries.py to use the BigQuery table govuk-knowledge-graph.graph.page as the input table for URLs.

This ensures that all URLs are included, including those built from a base_path and a slug, such as the individual parts of a guide.
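
As a rough illustration of the change described above, the query in sql_queries.py could be built along the following lines. This is only a sketch, not the code from the pull request: the helper name build_url_query and the column name url are assumptions, and only the table name govuk-knowledge-graph.graph.page comes from the description.

```python
from typing import Optional

# Table name taken from the PR description.
INPUT_TABLE = "govuk-knowledge-graph.graph.page"


def build_url_query(table: str = INPUT_TABLE, limit: Optional[int] = None) -> str:
    """Build a BigQuery Standard SQL query selecting distinct URLs from `table`.

    Hypothetical helper: the column name `url` is an assumption about the
    schema of the input table, not something confirmed by the PR.
    """
    # Backticks are required because the project ID contains hyphens.
    query = f"SELECT DISTINCT url FROM `{table}` WHERE url IS NOT NULL"
    if limit is not None:
        # Optional cap, handy for a quick test run on a small sample.
        query += f" LIMIT {int(limit)}"
    return query


if __name__ == "__main__":
    print(build_url_query(limit=10))
```

Keeping the table name in a single constant means a later switch of input table only touches one line.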
