Closed michielbdejong closed 3 years ago
Running select id, service_id, url from documents where url in (select distinct a.url from documents a inner join documents b on a.url = b.url where a.id != b.id) order by url
shows that many of the duplicate documents occur within the same service, so let me try to clean those up first.
To resolve the duplicate services problem, it makes sense to attack the duplicate documents problem first, because merging two services before that could leave the merged service with duplicated documents.
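
For illustration, here is a minimal sketch of how the within-service duplicates could be listed and then removed. It is not the cleanup that was actually run. It uses an in-memory SQLite database so it is self-contained; only the table name `documents` and the columns `id`, `service_id`, `url` come from the query above. The sample rows, the function names, and the rule "keep the row with the lowest id" are assumptions made for the example, and the real cleanup may need to move related records before deleting anything.

```python
import sqlite3

# Hypothetical sample data; only the column names come from the query in this thread.
ROWS = [
    (1, 10, "https://example.com/terms"),
    (2, 10, "https://example.com/terms"),    # duplicate within service 10
    (3, 11, "https://example.com/terms"),    # same URL, but a different service
    (4, 10, "https://example.com/privacy"),
]


def find_same_service_duplicates(conn):
    """Return (service_id, url, count) for every URL that occurs more than once within one service."""
    return conn.execute(
        """
        SELECT service_id, url, COUNT(*) AS n
        FROM documents
        GROUP BY service_id, url
        HAVING COUNT(*) > 1
        ORDER BY url, service_id
        """
    ).fetchall()


def delete_same_service_duplicates(conn):
    """Delete within-service duplicates and return the number of rows removed.

    Assumption: the row with the lowest id per (service_id, url) is the one to keep.
    """
    cur = conn.execute(
        """
        DELETE FROM documents
        WHERE id NOT IN (
            SELECT MIN(id) FROM documents GROUP BY service_id, url
        )
        """
    )
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, service_id INTEGER, url TEXT)"
)
conn.executemany("INSERT INTO documents VALUES (?, ?, ?)", ROWS)

duplicates = find_same_service_duplicates(conn)
deleted = delete_same_service_duplicates(conn)
remaining = conn.execute("SELECT id FROM documents ORDER BY id").fetchall()
```

Grouping by `(service_id, url)` leaves duplicates that span two different services untouched, so those can be handled later when the services themselves are merged.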