visipedia / fgvcx_flower_comp

2018 FGVCx Flower Classification Challenge

The 2018 competition is part of the FGVC^5 workshop at CVPR. Our sponsor, the Xingse App 形色 (Chinese version) & PictureThis App (English version), has provided a dataset from a carefully curated database containing over 669,000 annotated flower images from 997 flower species.

Please open an issue if you have questions or problems with the dataset.

Kaggle

We are using Kaggle to host the leaderboard. Check out the competition page here:

Dates

Data Released: April 27, 2018
Submission Deadline: June 8, 2018
Winners Announced: June 22, 2018

Details

There are a total of 997 flower species in the dataset, with 669,304 training and validation images. The testing set contains 12,961 images.

Evaluation

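We use top-1 error rate as the evaluation metric. For each image $i$, an algorithm will produce 1 label $l_i$. For this competition each image has one ground truth label $g_i$, and the error for that image is:

$$e_i = d(l_i, g_i)$$

Where

$$d(x, y) = \begin{cases} 0 & \text{if } x = y \\ 1 & \text{otherwise} \end{cases}$$

The overall error score for an algorithm is the average error over all $N$ test images:

$$\text{score} = \frac{1}{N} \sum_{i=1}^{N} e_i$$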
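
As a rough illustration, the top-1 error described above can be computed in a few lines of Python. This is only a sketch: the function name `top1_error` and the dict-based inputs are assumptions made here, not the official evaluation code, and counting a missing prediction as an error is likewise an assumption.

```python
def top1_error(predictions, ground_truth):
    """Return the fraction of images whose predicted label is wrong.

    Both arguments map image id -> category id. An image with no
    prediction counts as an error (an assumption of this sketch).
    """
    if not ground_truth:
        raise ValueError("ground_truth must contain at least one image")
    errors = sum(
        1
        for image_id, label in ground_truth.items()
        if predictions.get(image_id) != label
    )
    return errors / len(ground_truth)
```

For example, `top1_error({1: 5, 2: 7}, {1: 5, 2: 8})` gives `0.5`, since one of the two images is misclassified.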

Guidelines

Participants are restricted to training their algorithms on the 2018 FGVCx Flower Classification competition train and validation sets. Pretrained models may be used to construct the algorithms (e.g. ImageNet pretrained models) as long as participants do not actively collect additional data for the target species in the 2018 FGVCx Flower Classification competition. Please specify any and all external data used for training when uploading results.

The general rule is that we want participants to use only the provided training and validation images to train a model to classify the test images. We do not want participants crawling the web in search of additional data for the target categories. Participants should be in the mindset that this is the only data available for those categories.

Participants are allowed to collect additional annotations (e.g. bounding boxes) on the provided training and validation sets. Teams should specify that they collected additional annotations when submitting results.

Annotation Format

We closely follow the annotation format of the COCO dataset. To help with identifying flower species, extra information is provided:

  1. We provide when and where the images were uploaded. Note that this information is available for only ~60% of images.
  2. We provide taxonomic information including "Family" and "Genus" for all flower species.
  3. We provide an extra 200,000 unlabeled images.

The annotations are stored in the JSON format and are organized as follows:

{
  "info" : info,
  "images" : [image],
  "categories" : [category],
  "annotations" : [annotation],
  "licenses" : [license]
}

info{
  "year" : int,
  "version" : str,
  "description" : str,
  "contributor" : str,
  "url" : str,
  "date_created" : datetime
}

image{
  "id" : int,
  "width" : int,
  "height" : int,
  "file_name" : str,
  "license" : int,
  "rights_holder" : str,
  "upload_latitude": float,
  "upload_longitude": float,
  "upload_date": str
}

category{
  "id" : int,
  "genus": str,
  "family": str,
  "name" : str
}

annotation{
  "id" : int,
  "image_id" : int,
  "category_id" : int
}

license{
  "id" : int,
  "name" : str,
  "url" : str
}
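
A minimal sketch of how a file in this layout might be read in Python is given below. The helper names (`load_annotations`, `index_annotations`, `count_with_location`) are hypothetical. The sketch also assumes that an image without upload metadata either omits the `upload_latitude` / `upload_longitude` keys or sets them to null; the format description above does not say which.

```python
import json
from collections import defaultdict


def load_annotations(path):
    """Parse an annotation JSON file (path is whichever file you downloaded)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def index_annotations(dataset):
    """Build lookup tables from a parsed annotation dict.

    Returns (labels, by_family): labels maps image id -> category id,
    by_family maps family name -> sorted list of category ids.
    """
    labels = {ann["image_id"]: ann["category_id"] for ann in dataset["annotations"]}
    by_family = defaultdict(list)
    for cat in dataset["categories"]:
        by_family[cat["family"]].append(cat["id"])
    return labels, {family: sorted(ids) for family, ids in by_family.items()}


def count_with_location(dataset):
    """Count images that carry both upload coordinates.

    Assumption: missing metadata shows up as an absent key or a null value.
    """
    return sum(
        1
        for img in dataset["images"]
        if img.get("upload_latitude") is not None
        and img.get("upload_longitude") is not None
    )
```

The family grouping is one way to use the taxonomic fields, for instance to check whether a model's mistakes stay within the correct family.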

Submission Format

Submissions to the Kaggle competition are CSV files in the following format:

id,predicted
12345, 23
67890, 42

The id column holds the test image id. The predicted column holds a single predicted category id. You should have one row for each test image.
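
One possible way to produce such a file with Python's standard `csv` module is sketched below. The function name `write_submission` and the dict input (test image id -> predicted category id) are assumptions for illustration. The sketch writes no space after the comma, unlike the sample rows above.

```python
import csv
import io


def write_submission(predictions, out):
    """Write one `id,predicted` row per test image to the file-like object `out`.

    `predictions` maps test image id -> predicted category id.
    Rows are sorted by image id so the output is deterministic.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "predicted"])
    for image_id in sorted(predictions):
        writer.writerow([image_id, predictions[image_id]])


# Example: build the file contents in memory.
buffer = io.StringIO()
write_submission({67890: 42, 12345: 23}, buffer)
submission_text = buffer.getvalue()
```

To write to disk instead, open a file with `newline=""` (for example `open("submission.csv", "w", newline="")`, where the file name is arbitrary) and pass it as `out`.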

Terms of Use

By downloading this dataset you agree to the following terms:

  1. You will abide by the Xingse & PictureThis User Agreement (EULA).
  2. You will use the data only for non-commercial research and educational purposes.
  3. You will NOT distribute the above images.
  4. Xingse & PictureThis makes no representations or warranties regarding the data, including but not limited to warranties of non-infringement or fitness for a particular purpose.
  5. You accept full responsibility for your use of the data and shall defend and indemnify Xingse & PictureThis, including its employees, officers and agents, against any and all claims arising from your use of the data, including but not limited to your use of any copies of copyrighted images that you may create from the data.

Data

Download the dataset files from the Kaggle competition page:

For participants in China, downloads from Kaggle may be very slow. Please feel free to use the following link: