Closed FuhuXia closed 1 month ago
It turns out db-solr-sync was doing the right thing capturing packages without harvest_objects. The false positives were actually duplicates borrowing other packages' harvest_objects.
Another look at the current production daily report (~ 500 positives) seems to have the same issue.
Were all the duplicates false positives?
Nope. They are indeed duplicates, and the duplicate packages are using harvest_objects that do not belong to them, which is why db-solr-sync also caught them. We can run either the de-dupe process or the db-solr-sync task to fix them.
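For illustration only, here is a minimal sketch of the condition described above: a package that references a harvest_object owned by a different package. This is not the actual db-solr-sync code. The function name and the two dictionary shapes are hypothetical, chosen just to show the check.

```python
def find_borrowed_harvest_objects(packages, harvest_objects):
    """Return ids of packages that reference another package's harvest_object.

    packages: dict of package_id -> harvest_object_id it references
              (hypothetical shape, for illustration)
    harvest_objects: dict of harvest_object_id -> owning package_id
              (hypothetical shape, for illustration)
    """
    borrowed = []
    for package_id, harvest_object_id in packages.items():
        owner = harvest_objects.get(harvest_object_id)
        # The harvest_object exists, but it belongs to a different package.
        if owner is not None and owner != package_id:
            borrowed.append(package_id)
    return sorted(borrowed)


# Made-up example: "pkg-a-dup" is a duplicate borrowing pkg-a's harvest_object.
packages = {"pkg-a": "ho-1", "pkg-a-dup": "ho-1", "pkg-b": "ho-2"}
harvest_objects = {"ho-1": "pkg-a", "ho-2": "pkg-b"}
borrowed = find_borrowed_harvest_objects(packages, harvest_objects)
```

In this made-up data only the duplicate is flagged, which matches the observation that such packages have no harvest_object of their own.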
db-solr-sync seems to give false positives on "Packages without harvest_object" count.
Found the issue when doing a solr clear and reindex on the current catalog-dev. It reports two positives but the dataset looks fine. Another look at the current production daily report (~ 500 positives) seems to have the same issue.