shuffled data in train_test_split

Bahraleloom / Bitcoin-Price-Prediction-Historical-Analysis-and-Future-Trends

This repository delves into the analysis and prediction of Bitcoin prices using various data science techniques.

Hi! Very interesting work! However, I think you should disable shuffling when splitting the data. train_test_split shuffles by default; passing shuffle=False keeps the split in chronological order and prevents future data from leaking into the training set. Financial time series should never be shuffled or randomized when split into train and test sets. The 75% classification accuracy you report may be partly due to this leakage. Please set shuffle=False and rerun to check. Thank you!

https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
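
Here is a minimal sketch of what I mean. It uses synthetic, time-ordered data in place of the notebook's actual features and labels, so the names X, y, and the 80/20 split size are my assumptions, not the repository's code:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the real dataset: row i represents "day i",
# so the row value doubles as a timestamp we can inspect after splitting.
n_days = 100
X = np.arange(n_days).reshape(-1, 1)
y = np.arange(n_days) % 2  # placeholder up/down labels

# Chronological split: the first 80% of rows train, the last 20% test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)
no_leak = X_train.max() < X_test.min()  # every training day precedes every test day

# Default behaviour (shuffle=True): training rows are drawn from the whole period.
X_train_s, X_test_s, _, _ = train_test_split(X, y, test_size=0.2, random_state=0)
leak = X_train_s.max() > X_test_s.min()  # some training days come after test days

print("chronological split is leak-free:", no_leak)
print("shuffled split mixes future into training:", leak)
```

If you want cross-validation rather than a single split, sklearn.model_selection.TimeSeriesSplit may be worth a look, since it also keeps each training fold strictly before its test fold.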

shuffled data in train_test_split #1