Why is the training accuracy high but the validation result poor?

Hello, I am new to deep learning and would like to ask a question. I am using the part_seg program to classify a large-scale urban point cloud dataset (500 m × 500 m). The training data consists of the points in this dataset that have been assigned categories, and the validation data is another part of the same dataset. I treat the input training data as a single object label (city) and then segment it into four parts (ground, wall, roof, vegetation).

Training points: 300,000. Validation points: 150,000. Total points: 2,000,000.

During training, the training accuracy rose steadily to 90% and the training loss fell steadily to 0.4. As I understand it, this accuracy is the fraction of training points whose predicted category matches their input category. However, the validation accuracy and loss show no clear trend: the accuracy stays around 45% and fluctuates constantly, and the loss fluctuates around 2.
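To make the accuracy definition above concrete, here is a minimal sketch of how I understand the per-point accuracy. It is not code from the pointnet repository: the function name `point_accuracy`, the label encoding, and the example arrays are all hypothetical and only illustrate "correctly predicted points divided by all points".

```python
import numpy as np

def point_accuracy(pred_labels, true_labels):
    """Fraction of points whose predicted category equals the given category.

    Hypothetical helper for illustration; not part of the pointnet code base.
    """
    pred_labels = np.asarray(pred_labels)
    true_labels = np.asarray(true_labels)
    if pred_labels.shape != true_labels.shape:
        raise ValueError("pred_labels and true_labels must have the same shape")
    return float(np.mean(pred_labels == true_labels))

# Assumed encoding for this example: 0=ground, 1=wall, 2=roof, 3=vegetation
true_labels = np.array([0, 0, 1, 2, 3, 3, 1, 2])
pred_labels = np.array([0, 0, 1, 2, 3, 1, 1, 0])

accuracy = point_accuracy(pred_labels, true_labels)
print(f"accuracy = {accuracy:.2f}")
```

Computing the same quantity separately on the training points and on the validation points is what gives the two numbers I am comparing (90% versus 45%).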

The final result is also similar regardless of whether the input features are XYZ, XYZRI, or XYZRID (XYZ coordinates, Return number, Intensity, Density of points).

What could cause this? Judging from the training metrics alone the results look very good, but the validation results are very poor. Do you have any suggestions for improvement? Or should I use sem_seg instead of part_seg?

Thanks in advance.

charlesq34 / pointnet, issue #258