little-seasalt opened this issue 11 months ago
In general, you need to customize the dataloader and preprocess the data for each dataset.

- Customize the dataloader for each dataset (see SHA.py) and add it to datasets/__init__.py; a sketch is given after this list.
- Preprocess the dataset, e.g., by resizing the images and ground-truth points. This saves data loading time.
- Regarding data augmentation, you may either train the model without scale augmentation or tune the scale augmentation parameters.
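A minimal sketch of such a customized dataloader, assuming the repository's datasets follow the usual PyTorch `Dataset` pattern of returning an image and its head-point annotations. The file name (JHU.py), directory layout, and `.npy` annotation format are hypothetical; mirror whatever interface SHA.py actually exposes:

```python
# JHU.py -- hypothetical dataloader modeled on the pattern typically used by SHA.py.
# The exact interface (return types, transform hooks) must match SHA.py in this repo.
import os
import glob
import numpy as np
from PIL import Image
import torch
from torch.utils.data import Dataset


class JHU(Dataset):
    def __init__(self, root, split='train', transform=None):
        # Assumed layout: <root>/<split>/images/*.jpg with a matching
        # <root>/<split>/gt/*.npy file of shape (N, 2) holding (x, y) head points.
        self.img_paths = sorted(glob.glob(os.path.join(root, split, 'images', '*.jpg')))
        self.transform = transform

    def __len__(self):
        return len(self.img_paths)

    def __getitem__(self, idx):
        img_path = self.img_paths[idx]
        gt_path = img_path.replace('images', 'gt').replace('.jpg', '.npy')
        img = Image.open(img_path).convert('RGB')
        points = np.load(gt_path)  # (N, 2) array of annotated head coordinates
        if self.transform is not None:
            img = self.transform(img)
        return img, torch.from_numpy(points).float()


# In datasets/__init__.py, register the new class alongside the existing ones, e.g.:
# from .JHU import JHU
```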
Thank you for your answer.
Hello author: may I ask what kind of graphics card you used when training these datasets (UCF-QNRF, JHU-Crowd++, and NWPU-Crowd)? I frequently ran out of GPU memory during training, especially during evaluation. Is there any way to solve this problem?
Typically, an NVIDIA RTX 3090 is sufficient to train the model. Regarding `CUDA out of memory`, you may try to reduce the batch size and use parallel training; a sketch of both options is given below.
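For illustration, a minimal sketch of both suggestions, assuming a standard PyTorch training script. The variable names (`model`, `train_dataset`, `val_loader`) are placeholders, not the repository's actual identifiers:

```python
import torch
from torch.utils.data import DataLoader

# Reduce the batch size: GPU memory use scales roughly linearly with it.
train_loader = DataLoader(train_dataset, batch_size=4, shuffle=True)  # e.g. 8 -> 4

# Parallel training: spread each batch across all visible GPUs.
model = torch.nn.DataParallel(model).cuda()

# For evaluation, disabling gradient tracking avoids storing activations,
# which is often what triggers OOM on the large UCF-QNRF/NWPU images.
model.eval()
with torch.no_grad():
    for img, points in val_loader:
        pred = model(img.cuda())
```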
While training on the UCF-QNRF dataset, I found that one epoch takes about six minutes and each evaluation takes about two minutes. Is this time consumption normal? How much time did training take for you?
We suggest preprocessing the UCF-QNRF dataset before training, because loading the original images during training is time-consuming. After preprocessing, one epoch takes less than 40 seconds if you use two NVIDIA RTX 3090 GPUs for training.
I have preprocessed the UCF-QNRF dataset following the operations mentioned in the paper, i.e., limiting the longer side to 1536 pixels and resizing both the images and the ground-truth points. The other parts of the dataloader are written with reference to SHA.py. Are there any other data preprocessing operations that I have missed?
You should resize the images and ground-truth points once, save the preprocessed data, and then train the model on the saved copies; resizing images on the fly is time-consuming. A sketch of such a preprocessing script is given below.
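For concreteness, a minimal offline-preprocessing sketch. The directory layout and `.npy` annotation format here are assumptions for illustration (the published UCF-QNRF annotations ship as `.mat` files, so you may need `scipy.io.loadmat` instead); the key point is scaling the image and its points by the same factor and saving both:

```python
# preprocess_qnrf.py -- hypothetical offline resizing script.
# Caps the longer image side at 1536 px and scales the point annotations
# by the same factor, then saves both so training never resizes on the fly.
import os
import glob
import numpy as np
from PIL import Image

MAX_SIDE = 1536

def preprocess(src_dir, dst_dir):
    os.makedirs(dst_dir, exist_ok=True)
    for img_path in glob.glob(os.path.join(src_dir, '*.jpg')):
        img = Image.open(img_path).convert('RGB')
        points = np.load(img_path.replace('.jpg', '.npy'))  # (N, 2) (x, y) points

        scale = min(1.0, MAX_SIDE / max(img.size))  # never upscale
        if scale < 1.0:
            new_size = (round(img.size[0] * scale), round(img.size[1] * scale))
            img = img.resize(new_size, Image.BILINEAR)
            points = points * scale  # points live in pixel coordinates

        name = os.path.basename(img_path)
        img.save(os.path.join(dst_dir, name))
        np.save(os.path.join(dst_dir, name.replace('.jpg', '.npy')), points)

if __name__ == '__main__':
    preprocess('UCF-QNRF/Train', 'UCF-QNRF-1536/Train')
```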
Can anyone share the JHU.py file? I'm getting dimension mismatches when train.sh is run. Any insight is appreciated.
Have you reproduced the paper's metrics on the UCF-QNRF dataset? Would you be willing to share the relevant code?
Hello, I would like to ask, if I want to retrain the models on the UCF and JHU datasets, what changes need to be made to the existing code?