EscVM / OIDv4_ToolKit

Download and visualize single or multiple classes from the huge Open Images v4 dataset
GNU General Public License v3.0
809 stars, 635 forks

Image level download for positive labels only #34

Closed: JustusWest closed this issue 5 years ago

JustusWest commented 5 years ago

Hi guys, love this library. It's super helpful and I've been using it to download images for my research project. I was downloading image-level label images related to alcohol and noticed that an image gets downloaded into the directory for a label even if that label is negative for the image. For example, an image with Beer=1 and Wine=0 in train-annotations-human-imagelabels.csv gets downloaded into both the Beer and Wine directories. Is there a way to download images only into the directories that correspond to positive labels?

keldrom commented 5 years ago

@JustusWest thank you for the kind words :D By the way, if you post the command you used, I will try to understand the problem. Support for the new dataset (the one with no bounding boxes) may have introduced new issues like this.

JustusWest commented 5 years ago

I used: python3 main.py downloader_ill --sub h --classes classes.txt --type_csv train

keldrom commented 5 years ago

@JustusWest what's in your classes.txt file?

JustusWest commented 5 years ago

classes.txt

keldrom commented 5 years ago

I think your .txt file could look like this:

Beer
Wine
Alcohol
Non-alcoholic beverage

By the way, I think there can be some overlap between the classes in the dataset: for example, an image of a bottle of wine will also belong to the Alcohol class. In that case there is no issue in our toolkit. The provided dataset itself may include the same image in two different sub-datasets, so downloading the same image into two different subsets is correct.
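To tell a genuine overlap apart from a negative label, the annotations can be checked directly. The following is only a rough sketch, not part of the toolkit. It assumes the image-level CSV has the columns ImageID, Source, LabelName and Confidence, that Confidence is 1 for a positive and 0 for a negative human-verified label, and that LabelName holds the label's machine ID rather than its display name. The function name positive_images_by_label, the label IDs and the image IDs below are all made up for illustration.

```python
import csv
import io
from collections import defaultdict


def positive_images_by_label(csv_file, wanted_labels):
    """Map each wanted label to the set of image IDs that carry it as a positive label.

    Assumption: rows have the columns ImageID, LabelName and Confidence,
    and Confidence == 1 marks a positive label.
    """
    images = defaultdict(set)
    for row in csv.DictReader(csv_file):
        label = row["LabelName"]
        # Skip labels we did not ask for and any label that is not positive.
        if label in wanted_labels and float(row["Confidence"]) == 1.0:
            images[label].add(row["ImageID"])
    return dict(images)


# Hypothetical sample data standing in for train-annotations-human-imagelabels.csv.
sample = io.StringIO(
    "ImageID,Source,LabelName,Confidence\n"
    "img001,verification,/m/beer_id,1\n"
    "img001,verification,/m/wine_id,0\n"
    "img002,verification,/m/beer_id,1\n"
    "img002,verification,/m/wine_id,1\n"
)

positive = positive_images_by_label(sample, {"/m/beer_id", "/m/wine_id"})

# Images that are positive for both labels are a genuine overlap.
overlap = positive["/m/beer_id"] & positive["/m/wine_id"]
```

In this sample, img002 is a real overlap because it is positive for both labels, while img001 is negative for the wine label and so would belong only in the Beer directory.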