Owen Hua,
Puneet Kohli,
Pritish Uplavikar *,
Anand Ravi *,
Saravana Gunaseelan,
Jason Orozco,
Edward Li
Leia Inc.
* Denotes equal contribution
With the mass-market adoption of dual-camera mobile phones, leveraging stereo information in computer vision has become increasingly important. Current state-of-the-art methods utilize learning-based algorithms, where the amount and quality of training samples heavily influence results. Existing stereo image datasets are limited either in size or subject variety. Hence, algorithms trained on such datasets do not generalize well to scenarios encountered in mobile photography. We present Holopix50k, a novel in-the-wild stereo image dataset, comprising 49,368 image pairs contributed by users of the Holopix™ mobile social platform.
To download the Holopix50k dataset, you need a Python 3 environment and either wget or curl installed on your machine.
To download the complete dataset, run scripts/download_holopix50k.sh with the download path as follows:
./scripts/download_holopix50k.sh <DOWNLOAD_PATH>
You can also choose to download only a single dataset split by passing an optional argument to the script:
./scripts/download_holopix50k.sh <DOWNLOAD_PATH> [train|test|val]
The above commands download the dataset to <DOWNLOAD_PATH>/Holopix50k.
Note that the script temporarily installs the gsutil tool to download the dataset. If you face issues installing gsutil, check out the official installation guide here.
To download the dataset on Windows, you will need Python installed on your machine. Once you have Python set up, download gsutil from here and extract the downloaded archive to some GSUTIL_ROOT directory (for example, C:\gsutil).
Now run the following command to download the complete Holopix50k dataset:
python [GSUTIL_ROOT]\gsutil -m cp -n -r gs://holopix50k-dataset/Holopix50k <DOWNLOAD_PATH>
If you want to download a particular SPLIT ("train", "test" or "val") of the Holopix50k dataset, change the above command as follows:
python [GSUTIL_ROOT]\gsutil -m cp -n -r gs://holopix50k-dataset/Holopix50k/[SPLIT] <DOWNLOAD_PATH>
If you face issues installing gsutil, follow the installation guide here.
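The two gsutil invocations above differ only in the source URL, so they can be assembled from one helper. The following is a minimal sketch, not part of the repository: the function name build_gsutil_command and its parameters are hypothetical, while the bucket URL and the -m cp -n -r flags are copied from the commands above.

```python
import os
from typing import List, Optional

# Bucket URL and split names are taken from the commands in this README.
BUCKET_URL = "gs://holopix50k-dataset/Holopix50k"
VALID_SPLITS = ("train", "test", "val")


def build_gsutil_command(
    gsutil_root: str, download_path: str, split: Optional[str] = None
) -> List[str]:
    """Return the argument list for the gsutil copy command (hypothetical helper).

    gsutil_root is the directory the gsutil archive was extracted to.
    With split=None the whole dataset is requested; otherwise one split.
    """
    if split is not None and split not in VALID_SPLITS:
        raise ValueError(f"split must be one of {VALID_SPLITS}, got {split!r}")
    source = BUCKET_URL if split is None else f"{BUCKET_URL}/{split}"
    gsutil_script = os.path.join(gsutil_root, "gsutil")
    # Same flags as the commands above: -m cp -n -r <source> <destination>.
    return ["python", gsutil_script, "-m", "cp", "-n", "-r", source, download_path]
```

The returned list can be passed to subprocess.run(cmd, check=True) to start the download. Building a list rather than a single string avoids shell quoting problems when GSUTIL_ROOT or the download path contains spaces.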
Note that the size of the dataset you are able to download may differ from the original size of 49,368 stereo image pairs. Holopix50k is a crowdsourced dataset from the Holopix social media platform. The original user who posts an image on Holopix retains its copyright, as stated in our LICENSE. Hence, if a user deletes their image from Holopix, it is removed from our dataset and is no longer available for download. This is similar to how other crowdsourced datasets operate (e.g., WSVD).
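Because the downloadable size can change over time, it may help to check how many images actually arrived. The sketch below counts image files under each split directory. It assumes the splits are stored as train, test and val folders under <DOWNLOAD_PATH>/Holopix50k and that images use common file extensions; both are assumptions, and the function name count_images_per_split is hypothetical.

```python
from pathlib import Path
from typing import Dict

# Assumed image extensions; adjust if the dataset uses a different format.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}
SPLITS = ("train", "test", "val")


def count_images_per_split(dataset_root: str) -> Dict[str, int]:
    """Count image files found recursively under each split directory.

    A split directory that does not exist is reported with a count of 0.
    """
    counts: Dict[str, int] = {}
    for split in SPLITS:
        split_dir = Path(dataset_root) / split
        if not split_dir.is_dir():
            counts[split] = 0
            continue
        counts[split] = sum(
            1
            for path in split_dir.rglob("*")
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

Calling count_images_per_split on the Holopix50k folder returns one count per split. The counts are individual image files, so they need to be related to stereo pairs according to how the left and right images are stored on disk.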
If you use the Holopix50k dataset in your work, please cite our paper:
@InProceedings{hua2020holopix50k,
author = {Yiwen Hua and Puneet Kohli and Pritish Uplavikar and Anand Ravi and Saravana Gunaseelan and Jason Orozco and Edward Li},
title = {Holopix50k: A Large-Scale In-the-wild Stereo Image Dataset},
booktitle = {CVPR Workshop on Computer Vision for Augmented and Virtual Reality, Seattle, WA, 2020.},
month = {June},
year = {2020}
}