The original train_test_split function is modified to randomly shuffle the data before splitting the tuples into training and testing sets.
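df = df.sample(frac=1, random_state=1) shuffles the rows of the DataFrame df. sample returns a new DataFrame rather than modifying df in place, so the result is assigned back to df.
frac=1 samples 100% of the rows. Because sampling is without replacement by default, every row appears exactly once, only in a new order.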
random_state=1 sets a seed for the random number generator to make shuffling reproducible.
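The shuffling behaviour described above can be checked on a small example. This is a minimal sketch: the toy DataFrame and the name shuffled are made up for illustration and are not part of the original notebook.

```python
import pandas as pd

# Toy data, invented purely for illustration
df = pd.DataFrame({"x": [10, 20, 30, 40, 50], "y": [0, 1, 0, 1, 0]})

# sample returns a new, shuffled DataFrame; df itself is left untouched
shuffled = df.sample(frac=1, random_state=1)

# Same rows, possibly in a different order, with no duplicates
print(sorted(shuffled["x"]) == sorted(df["x"]))

# The same seed reproduces the same order
print(shuffled.equals(df.sample(frac=1, random_state=1)))

# The original DataFrame still has its original row order
print(df.index.tolist())
```

Note that the shuffled rows keep their original index labels, so any positional slicing afterwards should use iloc rather than label-based loc.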
Find the length of the DataFrame and use it to compute and store 'train_index', the position at which the data is split.
Assign the tuples below 'train_index' to X_train, and the tuple at 'train_index' together with all tuples above it to X_test.
Extract the target feature from X_train and X_test into y_train and y_test, respectively.
Drop the target feature from X_train and X_test.
Return X_train, y_train, X_test, y_test.
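The steps above could be put together roughly as follows. This is a sketch under stated assumptions, not the notebook's actual code: the parameter names target and train_frac, the 80/20 default ratio, and the demo data are all hypothetical, since the original signature and split ratio are not shown here.

```python
import pandas as pd


def train_test_split(df, target, train_frac=0.8):
    """Shuffle df, then split it into training and testing sets.

    target and train_frac are assumed names; the 0.8 default is an assumption.
    """
    # Shuffle all rows reproducibly (sample returns a new DataFrame)
    df = df.sample(frac=1, random_state=1)

    # Position at which the shuffled data is split
    train_index = int(len(df) * train_frac)

    # Rows below train_index go to training; train_index and above go to testing
    X_train = df.iloc[:train_index]
    X_test = df.iloc[train_index:]

    # Extract the target feature, then drop it from the feature sets
    y_train = X_train[target]
    y_test = X_test[target]
    X_train = X_train.drop(columns=[target])
    X_test = X_test.drop(columns=[target])

    return X_train, y_train, X_test, y_test


# Demo data, invented for illustration
demo = pd.DataFrame({"feature": range(10), "label": [0, 1] * 5})
X_train, y_train, X_test, y_test = train_test_split(demo, target="label")
print(len(X_train), len(X_test))
```

The return order here follows the description above (X_train, y_train, X_test, y_test), so callers need to unpack the result in that order.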
Call the function train_test_split.
The Final Result: