Currently, I split the data with the same function and the same seed (a default seed is set) in multiple scripts (whenever I'm classifying, etc.). However, to avoid any potential mishaps, let's just split once and save the data. This matters especially because we also build PCA models on the training data ONLY, so re-splitting in every script leaves a lot of steps where things can go wrong.
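A minimal sketch of the "split once, save, reload everywhere" idea, using only NumPy. The function names (`make_and_save_split`, `load_split`), the `.npz` file layout, and the default `test_fraction` and `seed` are assumptions for illustration, not the project's actual splitting function.

```python
import numpy as np


def make_and_save_split(X, y, path, test_fraction=0.2, seed=0):
    """Split once with a fixed seed and write all four arrays to one .npz file.

    Hypothetical helper: names, defaults, and file layout are illustrative.
    """
    X = np.asarray(X)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))  # shuffled row indices
    n_test = int(round(test_fraction * len(X)))
    test_idx, train_idx = order[:n_test], order[n_test:]
    np.savez(
        path,
        X_train=X[train_idx], y_train=y[train_idx],
        X_test=X[test_idx], y_test=y[test_idx],
    )


def load_split(path):
    """Load the saved split; downstream scripts call this instead of re-splitting."""
    with np.load(path) as data:
        return data["X_train"], data["y_train"], data["X_test"], data["y_test"]
```

Each classification script would then call `load_split(...)` on the same file, and any PCA model would be fit on the returned `X_train` only before being applied to `X_test`.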