alphagov / govuk-related-links-recommender

Machine learning model to recommend related content
MIT License

Restrict hostname to www.gov.uk #171

Open nacnudus opened 2 years ago

nacnudus commented 2 years ago

Now that the data includes domains other than www.gov.uk, we need to filter for www.gov.uk only, in case pagePath values collide across hostnames. The main collision, "/", is already excluded, but there might be others.

https://github.com/alphagov/govuk-related-links-recommender/blob/aea449c35ba504c0103bb279bcd0a01d0d5110e6/src/data_preprocessing/query_content_id_edge_weights.sql#L48
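To illustrate the collision, here is a minimal Python sketch rather than the project's actual SQL. It assumes each hit record carries `hostname` and `pagePath` fields. The sample rows, the `example.service.gov.uk` host, and the `count_page_views` helper are hypothetical, made up only for this example.

```python
from collections import Counter

# Only hits served from this hostname should contribute to the counts.
ALLOWED_HOSTNAME = "www.gov.uk"


def count_page_views(hits):
    """Count views per pagePath, keeping only hits on ALLOWED_HOSTNAME."""
    return Counter(
        hit["pagePath"]
        for hit in hits
        if hit.get("hostname") == ALLOWED_HOSTNAME
    )


# Hypothetical sample data: the second row shares a pagePath with the
# others but comes from a different hostname, so it must not be counted.
hits = [
    {"hostname": "www.gov.uk", "pagePath": "/vat-rates"},
    {"hostname": "example.service.gov.uk", "pagePath": "/vat-rates"},
    {"hostname": "www.gov.uk", "pagePath": "/vat-rates"},
]

views = count_page_views(hits)
```

Without the hostname check, the hit from the other domain would be merged into the www.gov.uk count for the same path. The equivalent fix in the query would presumably be an extra condition on the hostname column alongside the existing path exclusion.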

nacnudus commented 1 year ago

Do this as part of https://github.com/alphagov/govuk-related-links-recommender/issues/174