erikwijmans / Pointnet2_PyTorch

PyTorch implementation of Pointnet2/Pointnet++

Performance degradation after using higher-dimensional data #59

Open dandan19793 opened 5 years ago

dandan19793 commented 5 years ago

Hi, I'm trying to find edges in point clouds. First I down-sampled my dataset to 5k-point clouds and got the following performance:

precision 0.9722454458109377, recall 0.9774251897764888, specificity 0.9976463102707986, F1 score 0.9748284372091011

Then I down-sampled my dataset to 10k-point clouds instead and was surprised that the performance decreased immensely: precision 0.6414813848515061, recall 0.4429522551934075, specificity 0.9649306344666823, F1 score 0.5240442855918716

I should add that in the 10k dataset I kept all my edge vertices. That means that instead of having one edge vertex out of every 30 vertices, I now have one edge vertex out of every 10. (I'm not sure those numbers are exactly right, but the main point is that I used to have very few edge vertices and now have somewhat more of them in the samples I feed the network.)

Any idea what could cause this performance degradation? I would have expected that with more data and more positive vertices the network would learn better.

If I'm not mistaken, the only things I had to change in the code were "num_points" in the train function and loading the 10k data instead of the 5k data.

Any help would be much appreciated.

erikwijmans commented 5 years ago

You seem to be running into a common issue in ML -- imbalanced classes: https://towardsdatascience.com/handling-imbalanced-datasets-in-machine-learning-7a0e84220f28

A common way to deal with this in DL is loss-weighting.
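For instance, if edge detection is treated as per-vertex binary classification, the rare edge class can be up-weighted in the loss. A minimal sketch with plain PyTorch, not code from this repo; the roughly 1-in-10 edge ratio is taken from the description above, and the shapes are placeholders:

```python
import torch
import torch.nn as nn

# Assumed class balance from the 10k dataset: ~1 edge vertex per 10 vertices.
pos_fraction = 0.1
# Weight positives by the negative/positive ratio (~9x here) so the loss
# no longer rewards predicting "not an edge" everywhere.
pos_weight = torch.tensor([(1.0 - pos_fraction) / pos_fraction])

criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

# Placeholder per-vertex logits and 0/1 edge labels: (batch, num_points)
logits = torch.randn(8, 10000)
labels = torch.randint(0, 2, (8, 10000)).float()

loss = criterion(logits, labels)
```

The same idea works with `nn.CrossEntropyLoss(weight=...)` for a two-class softmax head; either way, the weights are a starting point to tune, not a fixed recipe.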