Updated Train Test Split to Divide Data Without using External Libraries #1

NeonKazuha commented 8 months ago

Updated the original train_test_split function so that it randomly shuffles the data before splitting the tuples into training and testing sets.
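df = df.sample(frac=1, random_state=1) shuffles the rows of the DataFrame df. sample returns a new DataFrame rather than modifying df in place, which is why the result is assigned back to df. frac=1 samples every row; because sampling is without replacement by default, each row appears exactly once. random_state=1 seeds the random number generator so the shuffle is reproducible.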
Find the length of the DataFrame and use it to compute and store the split position, 'train_index'.
Assign the tuples before position 'train_index' to X_train, and the tuple at 'train_index' together with all tuples after it to X_test.
Extract the target feature from X_train and X_test into y_train and y_test, respectively.
Drop the target feature from X_train and X_test.
Return X_train, y_train, X_test, y_test.
Call the function train_test_split.
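For reviewers, the steps above can be sketched roughly as follows. This is only an illustration of the described flow, not the code in this PR: the parameter names `target`, `test_size`, and `seed`, the 80/20 default, and the toy DataFrame are all assumptions made for the example.

```python
import pandas as pd


def train_test_split(df, target, test_size=0.2, seed=1):
    """Shuffle df, then split it into train/test features and targets.

    `target`, `test_size`, and `seed` are hypothetical parameters chosen
    for this sketch; the PR's actual signature may differ.
    """
    # sample() returns a new shuffled DataFrame; frac=1 keeps every row once.
    df = df.sample(frac=1, random_state=seed)

    # Position of the first test row.
    train_index = len(df) - int(len(df) * test_size)

    # Rows before train_index go to training; the row at train_index
    # and everything after it go to testing.
    X_train = df.iloc[:train_index]
    X_test = df.iloc[train_index:]

    # Pull the target column out before dropping it from the features.
    y_train = X_train[target]
    y_test = X_test[target]
    X_train = X_train.drop(columns=[target])
    X_test = X_test.drop(columns=[target])

    return X_train, y_train, X_test, y_test


# Toy data (made up for illustration only).
data = pd.DataFrame({"x1": range(10), "x2": range(10, 20), "y": range(20, 30)})
X_train, y_train, X_test, y_test = train_test_split(data, target="y")
```

Note that the return order here follows the description (X_train, y_train, X_test, y_test), which differs from scikit-learn's `train_test_split`, so callers unpacking the result should keep that in mind.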
The Final Result:

NeonKazuha commented 8 months ago

@darshbaxi Please Review.

darshbaxi commented 8 months ago

The code looks good:)

iiitl / Regression