Little tools to download and then weed through images, delete and classify them into groups for building deep learning image datasets (based on crawler and tkinter)
Apache License 2.0
Resizing results from multiple search engines (-c ALL) overwrites images #1
It looks like `resize` overwrites the output files when multiple crawlers are used. For example, when resizing, it goes through the Google results first and resizes 000001.jpg from the Google results to the output folder. Then it resizes 000001.jpg from the Bing results and saves it to the same folder, overwriting the image from Google. Finally, it resizes 000001.jpg from the Baidu results and also saves it to the output folder, overwriting the image from Bing.
So:
tmp/searchterm.google/000001.jpeg -> dataset/searchterm/000001.jpg
tmp/searchterm.bing/000001.jpeg -> dataset/searchterm/000001.jpg
tmp/searchterm.baidu/000001.jpeg -> dataset/searchterm/000001.jpg
This leaves only the image from the Baidu search in the output folder.
https://github.com/cwerner/fastclass/blob/4e418fa9aff2544b01052d005569b3b4912ca641/fc_download.py#L117
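One possible way to avoid the collision is to put the crawler name into the output file name. The snippet below is only a sketch of that idea, not the project's actual code: `unique_output_path` is a hypothetical helper, and it assumes the temporary folders are always named `<searchterm>.<crawler>` as in the example above.

```python
from pathlib import PurePosixPath


def unique_output_path(src, out_root):
    """Build a destination path that includes the crawler name.

    Hypothetical helper (not part of fastclass). Assumes the source
    folder is named "<searchterm>.<crawler>", e.g. "searchterm.google".
    """
    src = PurePosixPath(src)
    # Split "searchterm.google" into the search term and the crawler name.
    term, _, crawler = src.parent.name.rpartition(".")
    # Prefix the file name with the crawler so equal numbers cannot clash.
    return PurePosixPath(out_root) / term / f"{crawler}_{src.stem}.jpg"


sources = [
    "tmp/searchterm.google/000001.jpeg",
    "tmp/searchterm.bing/000001.jpeg",
    "tmp/searchterm.baidu/000001.jpeg",
]
for source in sources:
    print(source, "->", unique_output_path(source, "dataset"))
```

With this naming, the three files from the example would end up as `google_000001.jpg`, `bing_000001.jpg`, and `baidu_000001.jpg` in `dataset/searchterm/`. A running counter shared across all crawlers would work as well, if keeping the origin of each image in the file name is not wanted.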