There was some confusion about the list of files checked by the crawler.
The original code looped over the keys of a dict, which seemed an
unlikely result for a database query to return. The PR
https://github.com/fedora-infra/mirrormanager2/pull/107
changed it to loop over an existing database query result list. As a
result of this change, however, the crawler only looked at repodata
directories:
https://github.com/fedora-infra/mirrormanager2/issues/131
The reason the crawler actually has to loop over the keys of a dict is
that umdl reads each directory and creates a dict of the (roughly 10)
newest files in that directory. This dict is then pickled, stored in the
database, and read back by the crawler.
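The umdl/crawler handoff described above can be sketched roughly as
follows. This is an illustrative simplification, not the actual
MirrorManager2 code; the function name, the mtime-based ordering, and
the field layout are assumptions made for the example:

```python
import os
import pickle
import tempfile

def newest_files_dict(directory, limit=10):
    """Map filename -> mtime for the `limit` newest files in `directory`.

    A hypothetical stand-in for what umdl builds per directory; the
    real code and stored fields differ.
    """
    entries = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            entries.append((name, os.path.getmtime(path)))
    # Newest first, then keep at most `limit` entries.
    entries.sort(key=lambda e: e[1], reverse=True)
    return dict(entries[:limit])

# Demonstrate the round trip with a throwaway directory.
with tempfile.TemporaryDirectory() as d:
    for name in ("repomd.xml", "primary.xml.gz", "filelists.xml.gz"):
        open(os.path.join(d, name), "w").close()

    # umdl side: pickle the dict for storage in the database.
    blob = pickle.dumps(newest_files_dict(d))

    # crawler side: unpickle and iterate over the keys (the filenames);
    # this is the loop over the pickled dict that the change restores.
    files = sorted(pickle.loads(blob))

print(files)
```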
Instead of simply reverting the commit which removed the loop over the dict, this change keeps all other improvements and only changes the loop to use the pickled dict again.
Successfully tested in the staging environment.
Signed-off-by: Adrian Reber <adrian@lisas.de>