train with Shanghai Tech Part A and Part B dataset

cvlab-stonybrook / DM-Count

Code for NeurIPS 2020 paper: Distribution Matching for Crowd Counting.

MIT License

train with Shanghai Tech Part A and Part B dataset #2

Closed 18150167970 closed 4 years ago

18150167970 commented 4 years ago

Hello, I would like to know whether this project preprocesses the Shanghai Tech Part A and Part B datasets. We want to test this excellent model on our own dataset. Thank you for your help.

Boyu-Wang commented 4 years ago

Hi, Thanks for your interest! We did not preprocess the Shanghai Tech Part A or Part B dataset. During training, random crops are taken. The crop size is 256 for Part A and 512 for Part B.
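As a rough illustration of the random cropping described above, here is a minimal sketch that picks a random square crop box and shifts the head annotations into crop coordinates. This is not the repository's code: the helper names (random_crop_box, crop_points) are hypothetical, and raising an error for images smaller than the crop size is an assumption. Only the crop sizes (256 for Part A, 512 for Part B) come from this thread.

```python
import random

# Crop sizes quoted in this thread; the keys reuse the dataset names sha / shb.
CROP_SIZE = {"sha": 256, "shb": 512}


def random_crop_box(width, height, crop_size, rng=random):
    """Return (left, top, right, bottom) for a random square crop.

    Hypothetical helper: raising on too-small images is an assumption,
    not necessarily what the repository does.
    """
    if width < crop_size or height < crop_size:
        raise ValueError("image is smaller than the crop size")
    left = rng.randint(0, width - crop_size)  # randint is inclusive on both ends
    top = rng.randint(0, height - crop_size)
    return left, top, left + crop_size, top + crop_size


def crop_points(points, box):
    """Keep annotations that fall inside the box, shifted to crop coordinates."""
    left, top, right, bottom = box
    return [
        (x - left, y - top)
        for x, y in points
        if left <= x < right and top <= y < bottom
    ]
```

The returned box has the same (left, top, right, bottom) layout that Pillow's Image.crop accepts, so it could be passed to that call directly.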

enric1994 commented 4 years ago

Hi @Boyu-Wang, how can I train on ShanghaiTech then? There is no option to preprocess the dataset, and when I run train.py I get this error: ValueError: num_samples should be a positive integer value, but got num_samples=0. I assume this is because it expects the data in the "preprocessed" format.

Can you propose a solution to train with the ShanghaiTech dataset?

Thanks!
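For anyone hitting the same error: PyTorch raises num_samples=0 when a shuffling data loader is built over an empty dataset, so a quick check is to count the images the loader could find. The sketch below is a hypothetical helper, not part of the repository. The split folder name "train" and the "*.jpg" pattern are assumptions, and it searches recursively so it does not depend on the exact subfolder layout.

```python
from pathlib import Path


def count_images(data_dir, split="train", pattern="*.jpg"):
    """Count image files anywhere under data_dir/split.

    A result of 0 suggests the training code would see an empty dataset.
    The folder name and file pattern are assumptions for illustration.
    """
    split_dir = Path(data_dir) / split
    if not split_dir.is_dir():
        return 0
    return sum(1 for _ in split_dir.rglob(pattern))
```

If this returns 0 for the path passed to train.py, the folder names probably do not match what the dataset class expects.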

enric1994 commented 4 years ago

Okay, it is working now. I had to change the names of the Shanghai dataset to train and val instead of train_data. I also did the split myself.

wjourney commented 4 years ago

> Okay, it is working now. I had to change the names of the Shanghai dataset to train and val instead of train_data. I also did the split myself.

Hello, can you tell me how you trained on the ShanghaiTech dataset? I ran into the same problem as you.

enric1994 commented 4 years ago

In train_helper.py, replace from datasets.crowd import Crowd_qnrf with from datasets.crowd import Crowd_sh (line 11), and on line 53 replace Crowd_qnrf with Crowd_sh as well. Also, make sure that you have renamed the dataset folder to train instead of train_data. Since the ShanghaiTech dataset doesn't have a validation folder, you may want to split train in two or (probably not the right approach) rename test_data to val. Last but not least, make sure that the ground truth folders are named ground_truth instead of ground-truth.

Hope it helps!
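The folder changes above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name prepare_part is hypothetical, the images are assumed to live in an images subfolder, the annotations are assumed to be named GT_<image name>.mat, and the 10% validation fraction is an arbitrary choice. Only the folder names train_data, train, val, ground-truth and ground_truth come from this thread.

```python
import random
import shutil
from pathlib import Path


def prepare_part(part_dir, val_fraction=0.1, seed=0):
    """Rename ShanghaiTech folders and carve a validation split out of train.

    Assumed layout (not confirmed by the thread):
    part_dir/train_data/images/*.jpg and
    part_dir/train_data/ground-truth/GT_<image stem>.mat
    Returns the number of images moved to the validation split.
    """
    part_dir = Path(part_dir)
    train_dir = part_dir / "train"

    # train_data -> train
    if (part_dir / "train_data").is_dir() and not train_dir.exists():
        (part_dir / "train_data").rename(train_dir)

    # ground-truth -> ground_truth
    gt_old = train_dir / "ground-truth"
    if gt_old.is_dir():
        gt_old.rename(train_dir / "ground_truth")

    # Move a reproducible random subset of images (and annotations) to val.
    images = sorted((train_dir / "images").glob("*.jpg"))
    random.Random(seed).shuffle(images)
    n_val = int(len(images) * val_fraction)

    val_img = part_dir / "val" / "images"
    val_gt = part_dir / "val" / "ground_truth"
    val_img.mkdir(parents=True, exist_ok=True)
    val_gt.mkdir(parents=True, exist_ok=True)

    for img in images[:n_val]:
        shutil.move(str(img), str(val_img / img.name))
        gt = train_dir / "ground_truth" / f"GT_{img.stem}.mat"
        if gt.exists():
            shutil.move(str(gt), str(val_gt / gt.name))
    return n_val
```

It modifies the dataset in place, so running it on a copy of the data first is the safer option.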

Boyu-Wang commented 4 years ago

@enric1994 Thanks for your reply!

I've updated the code and README file. It now supports training on multiple datasets. To train on one of them, you could run this: python train.py --dataset <dataset name: qnrf, sha, shb or nwpu> --data-dir --device

fatbringer commented 2 years ago

Hi!

Do you know why the ShanghaiTech A and B datasets are always trained separately, instead of as one combined dataset?

alercelik commented 2 years ago

Part A and Part B are treated as different datasets. The aim is not to achieve the best overall result across all datasets, but to achieve the best result on each dataset independently.

fatbringer commented 2 years ago

@alercelik I have read that ShanghaiTech A is representative of dense crowds, while ShanghaiTech B is representative of sparse crowds.

Does that mean that, when counting people with these two datasets, we will always be using two sets of weights to cover both ranges?

Do you know what range of crowd sizes is best represented by each dataset?

alercelik commented 2 years ago

Yes, different weights are used for each dataset. You may look at some academic papers and see that these two parts are always treated as different datasets.