The list of Docker images hosted on Docker Hub is not publicly available. Schermann et al. (2018) crawled GitHub to find Docker image repositories, but this approach is not suitable for our case. I used the Docker Hub website API, which has a bug and does not support fetching more than 10,000 results.
To work around that limit, I wrote a Python script that simulates a web browser and crawls the Docker image space by recursively issuing search queries to the Docker Hub API, iterating over possible query strings so that each request returns fewer results. With this approach I retrieved a list of 1.7 million Docker images.
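The recursive query-splitting idea can be illustrated with a minimal sketch. It is not the original script: the HTTP and browser-simulation layer is left out, and the search backend is passed in as a function. The names `crawl`, `make_fake_search`, `ALPHABET` and `max_len` are hypothetical, and the sketch assumes that a search matches image names by substring and that a query returning more than the limit is truncated.

```python
import string
from typing import Callable, Iterable, List, Set, Tuple

# Cap on results per query, taken from the 10,000-result limit described above.
RESULT_LIMIT = 10_000
# Assumed set of characters used to extend a query string (hypothetical choice).
ALPHABET = string.ascii_lowercase + string.digits

# A search function maps a query to (total number of matches, names actually returned).
SearchFn = Callable[[str], Tuple[int, List[str]]]


def crawl(search: SearchFn, query: str = "", limit: int = RESULT_LIMIT,
          alphabet: str = ALPHABET, max_len: int = 6) -> Set[str]:
    """Collect image names, splitting any query that exceeds the limit."""
    total, names = search(query)
    found = set(names)  # keep whatever the (possibly truncated) query returned
    # Stop when the query fits under the limit, or when it has grown too long.
    if total <= limit or len(query) >= max_len:
        return found
    # Otherwise narrow the query by appending each candidate character.
    for ch in alphabet:
        found |= crawl(search, query + ch, limit, alphabet, max_len)
    return found


def make_fake_search(all_names: Iterable[str], limit: int) -> SearchFn:
    """In-memory stand-in for the real API (assumption: substring matching)."""
    catalog = sorted(all_names)

    def search(query: str) -> Tuple[int, List[str]]:
        matches = [name for name in catalog if query in name]
        # Report the true total but return at most `limit` names, like a capped API.
        return len(matches), matches[:limit]

    return search


if __name__ == "__main__":
    toy_names = ["alpine", "apache", "arch", "busybox", "nginx", "node"]
    toy_search = make_fake_search(toy_names, limit=2)
    print(sorted(crawl(toy_search, "", limit=2)))
```

In a real crawl, `search` would wrap the HTTP request to the search endpoint. Names containing characters outside the assumed alphabet may be missed by this sketch, so the set of candidate characters would have to match what image names can actually contain.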