kan-qi / UMLx

Analyze software architecture and generate insights about software development complexities, risks, and costs.

Neural network PCA analysis issue for prediction function #617

Open kan-qi opened 5 years ago

kan-qi commented 5 years ago

When running the benchmark on the neural network model, the following error occurs:

image

The error shows up mainly in the bootstrapping part of the benchmark. Even though CV currently runs to completion without error, the results of both CV and bootstrapping are affected.

The problem:

The model training step includes a pre-processing stage that applies PCA.

image

The input data are transformed into principal-component coordinates, and only 6 principal components are kept.

image

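In the prediction method, the same process (including PCA) is applied to the testing data. However, because PCA is run again on the testing data, the resulting PCs do not come from the transformation determined in the training step, so the model receives inputs in a different coordinate system from the one it was trained on.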

image

A better approach is to keep the transformation fitted by PCA in the training step and apply that same transformation to the testing set. A similar example can be found at:

https://www.datacamp.com/community/tutorials/pca-analysis-r

image
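
To make the difference concrete, here is a minimal sketch of the "fit on training, reuse on testing" idea. It is not the UMLx code: it is plain Python restricted to two features and one principal component so that it needs no libraries, and the names `fit_pca_2d`, `apply_pca_2d`, `train`, and `test` are hypothetical, as is the toy data. In R, the analogous pattern would presumably be to store the fitted `prcomp` object with the model and call `predict(pca_fit, newdata = ...)` on the testing data instead of running PCA again.

```python
import math


def fit_pca_2d(rows):
    """Fit a one-component PCA on two-feature rows.

    Returns (mean, direction): the training mean and the unit vector of the
    first principal component. Both need to be stored with the trained model.
    """
    n = len(rows)
    mean = (sum(x for x, _ in rows) / n, sum(y for _, y in rows) / n)
    sxx = sum((x - mean[0]) ** 2 for x, _ in rows) / (n - 1)
    syy = sum((y - mean[1]) ** 2 for _, y in rows) / (n - 1)
    sxy = sum((x - mean[0]) * (y - mean[1]) for x, y in rows) / (n - 1)
    # Closed-form angle of the leading eigenvector of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return mean, (math.cos(theta), math.sin(theta))


def apply_pca_2d(rows, mean, direction):
    """Project rows with an already-fitted mean and direction (no refitting)."""
    return [(x - mean[0]) * direction[0] + (y - mean[1]) * direction[1]
            for x, y in rows]


# Hypothetical toy data: training points lie along y = x,
# testing points lie along a perpendicular line.
train = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
test = [(4.0, 0.0), (5.0, -1.0), (6.0, -2.0)]

# Training step: fit once and keep the transformation.
train_mean, train_dir = fit_pca_2d(train)
train_scores = apply_pca_2d(train, train_mean, train_dir)

# Prediction step (intended): reuse the training transformation.
test_scores = apply_pca_2d(test, train_mean, train_dir)

# Prediction step (problematic): refit PCA on the testing data.
# The axis changes, so these scores are not comparable with train_scores.
refit_mean, refit_dir = fit_pca_2d(test)
refit_scores = apply_pca_2d(test, refit_mean, refit_dir)

print("training axis:", train_dir)
print("refit axis:   ", refit_dir)
print("reused scores:", test_scores)
print("refit scores: ", refit_scores)
```

In this toy case the refit axis is perpendicular to the training axis, so the two sets of scores for the same testing points disagree completely. A model trained on `train_scores` would only see consistent inputs in the `test_scores` variant.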