The difference between the scores in the paper and on GitHub

LinLin1031 commented 1 year ago

Hello! Thank you for your excellent work. I would like to ask a question.

For the S3DIS dataset, I noticed that Table II in the paper shows the scores on Area5 and 6-fold CV.

However, in the code, "s3dis_from_scratch.sh" trains and predicts on one Area at a time, selected by the value of the parameter CURR_AREA, and the per-Area results are the ones shown on GitHub. Does this mean that "s3dis_from_scratch.sh" can only train and predict for a specific Area, and cannot reproduce the Area5 and 6-fold CV scores from the paper?

Best.

JonasSchult commented 1 year ago

Hi!

In order to reproduce the scores for Area5 you need to set CURR_AREA to 5. 6-fold CV is the average over all individually held-out areas, i.e., you need to train 6 models (leave out 1 of the 6 areas for each of the individual trainings) and average their results.
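A minimal sketch of the averaging step, assuming you have already collected one validation score per held-out area from the six separate trainings. The function name and the dictionary layout are hypothetical and not part of the Mask3D codebase:

```python
from statistics import mean


def six_fold_cv_score(per_area_scores):
    """Average the scores of the six leave-one-area-out runs.

    per_area_scores maps the held-out area number (1..6) to the score
    of the model that was trained with that area held out.
    """
    expected = set(range(1, 7))
    missing = expected - set(per_area_scores)
    if missing:
        # Refuse to report a "6-fold" score from fewer than six runs.
        raise ValueError(f"missing results for areas: {sorted(missing)}")
    return mean(per_area_scores[area] for area in sorted(expected))
```

The Area5 score is then simply the entry for area 5, while the 6-fold CV score is the mean over all six entries.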

Best, Jonas

LinLin1031 commented 1 year ago

Hello!

Thank you for your prompt reply. When CURR_AREA is set to a specific value, does the code automatically divide all data in that Area into a training set, a validation set, and a test set? To be honest, I'm not very experienced, and I find the organization and functionality of your code a bit confusing, so I would be grateful for your guidance. Thanks a lot!

Best.

JonasSchult commented 1 year ago

Hi!

No problem! :)

For example: if you want to reproduce the scores for Area5, you need to set CURR_AREA=5 in s3dis_from_scratch.sh. Everything else is then done by the codebase (preparing the training set and validation set, ...).

Let me know if you need further details! :)

Best, Jonas

LinLin1031 commented 1 year ago

Hello!

Thank you for your patient answer. Does the way you described for calculating the Area5 and 6-fold CV scores differ from the common way of calculating them? As far as I understand, does your calculation treat the 6 areas completely separately? I still don't quite understand why the paper chose to train, validate and test on each area separately.

In some related papers I've read: for the Area5 score, they use Area5 as the test set and the remaining 5 areas as the training and validation sets. The Area5 score is then obtained by evaluating the trained model on Area5. For the 6-fold CV score, they perform 6-fold cross-validation using all 6 areas of S3DIS. Each time, one area is set as the validation set and the remaining areas as the training set. Finally, the scores obtained from the 6 validations are averaged to obtain the 6-fold CV scores.

This approach seems very different from the one in this paper. If my understanding is wrong, I would appreciate your correcting it. Thank you!

Best.

JonasSchult commented 1 year ago

Hi!

Thanks for your question! :) I think we mean exactly the same thing.

You wrote:

> Each time, one area is set as the validation set and the remaining areas as the training set. Finally, the scores obtained from the 6 validations are averaged to obtain the 6-fold CV scores.

This is exactly how it is also done in our codebase.
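The protocol both sides describe can be sketched as follows. This is only an illustration of the split logic, not code taken from the Mask3D repository, and the function name is hypothetical:

```python
def leave_one_area_out(areas):
    """Yield (train_areas, val_area) pairs, holding out each area once."""
    areas = list(areas)
    if len(areas) < 2:
        raise ValueError("need at least two areas: one for validation, the rest for training")
    for val_area in areas:
        # Every area except the held-out one goes into the training set.
        train_areas = [a for a in areas if a != val_area]
        yield train_areas, val_area


# One training per fold; the held-out area plays the role of CURR_AREA.
for train_areas, val_area in leave_one_area_out(range(1, 7)):
    print(f"CURR_AREA={val_area}: train on Areas {train_areas}, validate on Area {val_area}")
```

With CURR_AREA=5 this gives the Area5 setting (train on Areas 1, 2, 3, 4 and 6, validate on Area 5); running all six folds and averaging gives the 6-fold CV score.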

Best, Jonas

LinLin1031 commented 1 year ago

Hello!

Thank you for your answer! I think I understand now: in "s3dis_from_scratch.sh", the value of the parameter CURR_AREA means that the Area with that number is used as the validation set and the other Areas are used as the training set. It does not mean that both training and validation are performed on the Area with that number (which was my original misunderstanding). In that case, I think this setup makes it easy to change which Areas are used for training and which for validation (of course, only 1 of the 6 Areas can be selected as the validation set at a time). This has really cleared things up for me!

So it seems that if I want to try to train and validate the custom dataset in the S3DIS way, I should divide it into at least two Areas following the S3DIS organization structure, in order to select one Area as the validation set and the others as the training set. Is this a correct understanding?

Also, if the custom dataset is for a larger scene (e.g. point cloud for a large outdoor scene), is it possible to make it work smoothly by increasing the value of the parameter crop_length? (This is what I learned from checking the previous issue) Do you have any other suggestions?

Best.

LinLin1031 commented 1 year ago

@JonasSchult I know you are busy, but I look forward to your answer. Thank you!

JonasSchult commented 1 year ago

Hi!

> So it seems that if I want to try to train and validate the custom dataset in the S3DIS way, I should divide it into at least two Areas following the S3DIS organization structure, in order to select one Area as the validation set and the others as the training set. Is this a correct understanding?

Yes, that's correct! :)

> Also, if the custom dataset is for a larger scene (e.g. point cloud for a large outdoor scene), is it possible to make it work smoothly by increasing the value of the parameter crop_length? (This is what I learned from checking the previous issue) Do you have any other suggestions?

For your custom dataset, you can play with the crop parameters, e.g. by setting the crop_length parameter to a higher value. Don't forget to enable cropping as well. Your hydra overrides could look like this: data.cropping=true data.crop_length=11.0
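If you want to try several crop settings, a small helper can build the override string for you. This is a rough sketch: the helper name is hypothetical, and only the two keys data.cropping and data.crop_length come from this thread. The resulting string is meant to be appended to whatever training command you already use.

```python
def to_hydra_overrides(params):
    """Format a dict as Hydra-style command-line overrides (key=value)."""
    parts = []
    for key, value in params.items():
        if isinstance(value, bool):
            # Write True/False as true/false, matching the form used above.
            value = str(value).lower()
        parts.append(f"{key}={value}")
    return parts


overrides = to_hydra_overrides({"data.cropping": True, "data.crop_length": 11.0})
print(" ".join(overrides))  # data.cropping=true data.crop_length=11.0
```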

Let me know if you need more details :)

Best, Jonas

JonasSchult / Mask3D, issue #61