Closed Ytang520 closed 1 year ago
I am puzzled by your use of standard normalization on the y_label (score), as well as the linear activation function in the final dense layer (as shown in the picture below). Given that this is a score-prediction problem, and scores are limited to [0, 1] on the UI-PRMD dataset or [0, 50] on the Kimore dataset, the final predicted score could exceed these limits. Is there a reason for this implementation that I may have missed?

While it is possible for the final predicted score to exceed the allowed limits and produce invalid predictions, there are specific reasons the implementation is set up this way. Normalizing the targets and using a linear output layer improves numerical stability and lets the network learn complex relationships between the input features and the scores effectively. The unbounded predictions can then be transformed back into the allowed range (e.g., [0, 50]) during evaluation or deployment. This approach lets the network learn from the full range of the data rather than only the allowed range, which can lead to better overall performance.
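The post-hoc transform mentioned here (train against standardized scores with a linear head, then de-normalize and clip predictions back into the valid range at evaluation time) might look roughly like the sketch below. The training scores, the raw network output, and the [0, 50] Kimore-style range are illustrative assumptions, not the repository's actual code:

```python
import numpy as np

# Hypothetical training scores on a [0, 50] scale (Kimore-style).
y_train = np.array([12.0, 35.5, 48.0, 7.25, 50.0])

# Standard normalization (z-score) of the labels: this is what the
# network with a linear output layer would be trained to regress.
mu, sigma = y_train.mean(), y_train.std()
y_norm = (y_train - mu) / sigma

# Suppose the linear output head produces an unbounded prediction.
pred_norm = 2.1  # hypothetical raw network output

# De-normalize, then clip back into the allowed score range
# before evaluation or deployment.
pred_score = pred_norm * sigma + mu
pred_score = float(np.clip(pred_score, 0.0, 50.0))
print(pred_score)
```

Clipping only at evaluation time keeps the training objective a plain regression loss on the standardized targets, while still guaranteeing that reported scores stay inside the dataset's valid range.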