Open VaishnaviChelagola opened 5 hours ago
Ensure the issue is not similar to one already open or being worked on. Thanks for your time.
@VaishnaviChelagola, all the best. Please make sure to star the repo; your contribution is highly appreciated.
Thank You sir!!
Is this a unique feature?
Is your feature request related to a problem/unavailable functionality? Please describe.
The stock price prediction model currently uses "train_test_split" to split the data randomly, which is poorly suited to time-series data. A random split ignores the sequential nature of stock data, so observations from later dates can end up in the training set while earlier dates are used for testing. This may result in data leakage and inaccurate model evaluation.
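To illustrate the problem, here is a minimal sketch using made-up data (20 consecutive day indices stand in for the price series; the variable names are illustrative and not taken from the repo). It relies on the fact that "train_test_split" shuffles by default, and compares that with the same call using shuffle=False.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up stand-in for a time-ordered series: day 0 is the oldest, day 19 the newest.
days = np.arange(20)

# Default behaviour: shuffle=True, so train and test days are interleaved in time.
train_days, test_days = train_test_split(days, test_size=0.25, random_state=0)

# Leakage check: the training set contains days later than the earliest test day.
leaks_future = bool(train_days.max() > test_days.min())
print("shuffled split leaks future data:", leaks_future)

# With shuffle=False the split is a simple chronological cut.
ordered_train, ordered_test = train_test_split(days, test_size=0.25, shuffle=False)
print("ordered train:", ordered_train.tolist())
print("ordered test: ", ordered_test.tolist())
```

With shuffle=False the last 25% of the days form the test set, which already avoids the leakage but gives only a single evaluation window.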
Proposed Solution
To split the data sequentially, I propose using "TimeSeriesSplit" from scikit-learn. This approach maintains the temporal order: in every split, the model is trained on past data and evaluated on the data that follows it.
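As a rough illustration of how "TimeSeriesSplit" produces its folds, the sketch below splits eight made-up samples into three folds. The sample count and n_splits=3 are arbitrary choices for the example, not values from the project.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Eight made-up, time-ordered samples (index 0 is the oldest).
X = np.arange(8).reshape(-1, 1)

tscv = TimeSeriesSplit(n_splits=3)
splits = [(train_idx.tolist(), test_idx.tolist()) for train_idx, test_idx in tscv.split(X)]

for train_idx, test_idx in splits:
    print(f"train={train_idx} test={test_idx}")
# train=[0, 1] test=[2, 3]
# train=[0, 1, 2, 3] test=[4, 5]
# train=[0, 1, 2, 3, 4, 5] test=[6, 7]
```

Each training window grows over time, and every test window lies strictly after the data used for training.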
Screenshots
No response
Do you want to work on this issue?
Yes
If "yes" to above, please explain how you would technically implement this (issue will not be assigned if this is skipped)
I'll change the dataset splitting procedure to use 'TimeSeriesSplit' and adjust the model training to handle multiple splits. To show how 'TimeSeriesSplit' affects the evaluation of the model on stock price data, I will also present comparison metrics (such as RMSE and MAE) before and after the implementation.
Steps:
1. Modify the data splitting logic to use TimeSeriesSplit.
2. Train the model on each split and calculate evaluation metrics.
3. Compare the results with the current random data split method.
4. Provide detailed documentation on how this feature improves the accuracy of predictions on time-series data.
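The steps above could be sketched roughly as follows. Everything here is an assumption made for illustration: the prices are a synthetic random walk, the features are the three previous prices, and LinearRegression stands in for whatever model the repo actually uses. The helper names (make_lagged_dataset, score, and the two evaluate_* functions) are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import TimeSeriesSplit, train_test_split


def make_lagged_dataset(prices, n_lags=3):
    """Hypothetical helper: features are the previous n_lags prices, target is the next price."""
    X = np.column_stack([prices[i : len(prices) - n_lags + i] for i in range(n_lags)])
    y = prices[n_lags:]
    return X, y


def score(model, X_train, y_train, X_test, y_test):
    """Fit on the training part and return (RMSE, MAE) on the test part."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
    mae = float(mean_absolute_error(y_test, pred))
    return rmse, mae


def evaluate_time_series_split(X, y, n_splits=5):
    """Steps 1 and 2: train a fresh model on each chronological split."""
    fold_scores = []
    for train_idx, test_idx in TimeSeriesSplit(n_splits=n_splits).split(X):
        fold_scores.append(
            score(LinearRegression(), X[train_idx], y[train_idx], X[test_idx], y[test_idx])
        )
    return fold_scores


def evaluate_random_split(X, y, test_size=0.2, random_state=0):
    """Step 3 baseline: the current random split via train_test_split."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    return score(LinearRegression(), X_tr, y_tr, X_te, y_te)


# Synthetic random-walk prices, used only so the sketch runs on its own.
rng = np.random.default_rng(42)
prices = 100 + np.cumsum(rng.normal(0, 1, size=300))
X, y = make_lagged_dataset(prices)

ts_scores = evaluate_time_series_split(X, y)
for fold, (rmse, mae) in enumerate(ts_scores, start=1):
    print(f"TimeSeriesSplit fold {fold}: RMSE={rmse:.3f} MAE={mae:.3f}")

rand_rmse, rand_mae = evaluate_random_split(X, y)
print(f"Random split: RMSE={rand_rmse:.3f} MAE={rand_mae:.3f}")
```

Reporting the metrics per fold, rather than only their average, makes it easier to see whether the error changes as the training window grows.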