Closed (Kins1ley closed this 4 years ago)
- There are 10 classes in total, and we expect each class to account for about 10% of the total samples. In practice, a single sample contains objects of several categories, so increasing the ground-truth (gt) counts of rare classes also increases those of the major classes. The final distribution is therefore not perfectly uniform.
- After the sampling procedure, the dataset is actually 4.5 times larger than the original dataset. For a fair comparison, we divide the gt counts by 4.5, so that the gt instance distributions before and after sampling can be compared.
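The two points above can be illustrated with a small toy sketch. This is not the repository's code: the function names `class_balanced_resample` and `gt_distribution`, the three-class toy dataset, and the use of sampling with replacement are all assumptions made for illustration. It only shows why co-occurring classes keep the result from being uniform, and why the counts are divided by the growth factor (4.5 in the real dataset, 1.2 in this toy case).

```python
import random
from collections import Counter


def class_balanced_resample(samples, class_names, seed=0):
    """Hypothetical helper: draw an equal number of samples per class.

    `samples` is a list of samples, each given as the list of gt class
    names it contains. A sample is bucketed under every class it contains.
    """
    rng = random.Random(seed)
    per_class = {c: [s for s in samples if c in s] for c in class_names}
    # Samples are counted once per class they contain, so this total is
    # usually larger than len(samples).
    duplicated_total = sum(len(bucket) for bucket in per_class.values())
    target = duplicated_total // len(class_names)  # equal share per class
    resampled = []
    for c in class_names:
        if per_class[c]:
            # Sampling with replacement is an assumption of this sketch.
            resampled += rng.choices(per_class[c], k=target)
    return resampled


def gt_distribution(samples, scale=1.0):
    """Count gt instances per class, divided by `scale`."""
    counts = Counter(name for s in samples for name in s)
    return {name: n / scale for name, n in counts.items()}


# Toy data: "car" appears in every sample, the other classes are rare.
class_names = ["car", "pedestrian", "bicycle"]
samples = [["car"]] * 6 + [["car", "pedestrian"]] * 3 + [["car", "bicycle"]]

resampled = class_balanced_resample(samples, class_names)
growth = len(resampled) / len(samples)  # dataset growth factor

before = gt_distribution(samples)
after = gt_distribution(resampled, scale=growth)  # normalized for comparison
print("growth factor:", growth)
print("before:", before)
print("after (normalized):", after)
```

In this toy case every resampled sample still contains a car, so the car count cannot drop relative to the dataset size even though the rare classes gain instances. That is the same effect that keeps the real distribution from becoming uniform.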
Thanks for your reply. Is the data sampling code in load_infos in nuscenes.py?
May I ask where the data sampling idea comes from? It seems like an interesting way to address the imbalance problem in the dataset.
I thought of it by accident
Is there an ablation study on the improvement from the proposed sampling? It would benefit the community a lot if you could provide some details about it. :) @poodarchu
In Section 2.1, Input and Augmentation:
I am confused by "So we randomly sample 10% of 128106 (12810) point cloud samples for each category from the class-specific samples mentioned above." What does this mean? No further explanation seems to be given in the text that follows.
As for Figure 2, do the blue columns and the orange columns share the same vertical axis?