Closed: Conchylicultor closed this issue 4 years ago
@Conchylicultor I would like to take this issue; I'm working on it.
There are some unnecessary files in the caltech_birds2011 dataset.
@vijayphoenix I don't understand why they are unnecessary; they are present in CUB_200_2011.tar.gz
@Eshan-Agarwal the files were downloaded but never used for dataset generation
What's left in this issue?
The fake data for caltech_birds2011 is way too big (> 100MB). We should investigate where this huge size comes from and try to reduce it.
Fake data is at https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/testing/test_data/fake_examples/caltech_birds2011
The test is at: https://github.com/tensorflow/datasets/blob/master/tensorflow_datasets/image_classification/caltech_birds_test.py
Other datasets with a huge fake data size are:
Those datasets account for more than 70% of the total fake data size, and caltech_birds2011 alone is almost half of it. Reducing the size of that fake data would have a huge impact on the size of our GitHub repository.
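To see where the size comes from, something like the sketch below could help. It is not part of the repository. The helper names `size_by_subdir` and `report` are made up for illustration. It sums file sizes under each immediate subdirectory of a folder and prints each one's share of the total.

```python
import os
from collections import Counter


def size_by_subdir(root):
    """Return total bytes of regular files, grouped by immediate child of root.

    Files that sit directly in root are counted under the key ".".
    (Hypothetical helper for illustration.)
    """
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        top = "." if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):  # skip symlinks to avoid double counting
                totals[top] += os.path.getsize(path)
    return totals


def report(root):
    """Print each subdirectory's size in MB and its share, largest first."""
    totals = size_by_subdir(root)
    grand_total = sum(totals.values()) or 1  # guard against an empty directory
    for name, size in totals.most_common():
        share = 100 * size / grand_total
        print(f"{name:40s} {size / 1e6:10.2f} MB {share:6.1f}%")
```

Calling `report("tensorflow_datasets/testing/test_data/fake_examples")` from a checkout would list the per-dataset totals. Pointing it at the `caltech_birds2011` subfolder instead would show which of its subdirectories dominate.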