Enquiry on preparing dataset for human re-identification

Cysu / dgd_person_reid

Domain Guided Dropout for Person Re-identification

http://arxiv.org/abs/1604.07528

Enquiry on preparing dataset for human re-identification #18

Closed deweilow closed 7 years ago

deweilow commented 7 years ago

I am working on a human re-identification project using Caffe. I have obtained the public CUHK01 dataset, which contains 971 identities with 2 images per view. I want to assign one label to each of the 971 identities in the training set, but I am unsure how to write the Python code for it. The images are named 0001001.png, 0001002.png, 0001003.png, 0001004.png for the first identity, and the pattern repeats for the remaining 970 identities. Do you have any suggestions or code for assigning the images to the correct labels?

Cysu commented 7 years ago

Please refer to this script for processing the CUHK01 dataset.

https://github.com/Cysu/dgd_person_reid/blob/master/data/format_cuhk01.py
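
For readers who only want the label-assignment step, here is a minimal sketch based on the file names given in the question. It is not the linked script. It assumes that the first four digits of each name are the identity and the last three are the image index; the function names `label_from_filename` and `group_by_identity` are made up for illustration.

```python
import re

# Assumption: names look like 0001001.png, i.e. a 4-digit identity
# followed by a 3-digit image index, as described in the question.
FILENAME_PATTERN = re.compile(r"^(\d{4})(\d{3})\.png$")


def label_from_filename(filename):
    """Return (zero-based label, image index) for a CUHK01-style file name."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError("unexpected file name: " + filename)
    identity = int(match.group(1))
    image_index = int(match.group(2))
    # Identities start at 0001, while classifier labels usually start at 0.
    return identity - 1, image_index


def group_by_identity(filenames):
    """Map each zero-based label to the sorted list of its file names."""
    groups = {}
    for name in sorted(filenames):
        label, _ = label_from_filename(name)
        groups.setdefault(label, []).append(name)
    return groups
```

Calling `group_by_identity(os.listdir(image_dir))` on the image folder should then give one entry per identity, provided the folder contains only files that follow this naming pattern.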

deweilow commented 7 years ago

Hi, thank you for your prompt reply. As I am new to Caffe, Python programming, and human re-identification, may I ask what the script you linked above does to process the dataset?

May I also ask how you do training and testing on a human re-identification dataset? For example, the CUHK01 dataset has 971 identities with 2 images per view. Do you use 3 images of each of the 971 identities for training, with one label per identity (971 labels), and then use the remaining image of each identity to test the trained model? Sorry for the inconvenience, and thank you for your help.

Cysu commented 7 years ago

We first split the 971 identities into 871 for training/validation and 100 for test. Then we split the images of the 871 identities into training and validation sets.
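
The two-level split described above could be sketched roughly as follows. The 871/100 numbers come from this reply; the random shuffling, the fixed seed, and the choice of one validation image per identity are assumptions made for illustration, and the function names are hypothetical.

```python
import random


def split_identities(num_identities=971, num_test=100, seed=0):
    """Split identity indices into (train/val identities, test identities)."""
    rng = random.Random(seed)  # assumption: a seeded random split
    ids = list(range(num_identities))
    rng.shuffle(ids)
    test_ids = sorted(ids[:num_test])
    trainval_ids = sorted(ids[num_test:])
    return trainval_ids, test_ids


def split_images(images_by_id, trainval_ids, num_val_per_id=1, seed=0):
    """Split the images of the train/val identities into train and val lists.

    Each entry is (image name, label), where labels are renumbered
    0..len(trainval_ids)-1 so the classifier has one output per identity.
    """
    rng = random.Random(seed)
    train, val = [], []
    for new_label, pid in enumerate(trainval_ids):
        images = list(images_by_id[pid])
        rng.shuffle(images)
        # Assumption: hold out num_val_per_id images of every identity.
        val += [(img, new_label) for img in images[:num_val_per_id]]
        train += [(img, new_label) for img in images[num_val_per_id:]]
    return train, val
```

The test identities never appear in the training or validation lists, so the classifier is never trained on them.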

The training/validation objective is to predict the ID of each person. During testing, we ignore the classifier and only compare the distances between the features.
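
As a rough illustration of the test-time step, the sketch below ranks gallery images by their feature distance to a probe image and measures how often the correct identity appears among the top k matches. The Euclidean distance, the probe/gallery terminology, and all function names are assumptions; the thread does not say which distance is used. Features are plain lists of floats, however they were extracted.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_gallery(probe_feature, gallery_features):
    """Return gallery indices sorted from nearest to farthest."""
    distances = [euclidean(probe_feature, g) for g in gallery_features]
    return sorted(range(len(gallery_features)), key=lambda i: distances[i])


def top_k_accuracy(probe_features, probe_ids, gallery_features, gallery_ids, k=1):
    """Fraction of probes whose identity appears among the k nearest gallery items."""
    hits = 0
    for feature, pid in zip(probe_features, probe_ids):
        nearest = rank_gallery(feature, gallery_features)[:k]
        if any(gallery_ids[i] == pid for i in nearest):
            hits += 1
    return hits / len(probe_ids)
```

No classifier output is involved here: an identity that was never seen during training can still be matched, as long as its probe and gallery features are close to each other.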