While reading the sample_per_class function (around lines 188 and 189), I noticed what looks like a discrepancy in the size of the test set. Following the code logic, the test set seems to end up at approximately (1 - 20%) * 80% = 64% of the total data, rather than the stated 80%.
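To make the arithmetic concrete, here is a minimal sketch of the calculation. It does not reproduce sample_per_class; the helper name `effective_fraction` and the variable names are hypothetical, and it assumes that 20% of the data is set aside before the 80% test fraction is applied, which is how I read the logic.

```python
def effective_fraction(stated_fraction, held_out_fraction):
    """Share of the TOTAL data selected when `held_out_fraction` is removed
    first and `stated_fraction` is then applied to what remains.

    Hypothetical helper for illustration only; not part of the repository.
    """
    return (1.0 - held_out_fraction) * stated_fraction


stated_test = 0.80  # test fraction as stated
held_out = 0.20     # fraction assumed to be removed before the test split

actual_test = effective_fraction(stated_test, held_out)
print(f"stated test share: {stated_test:.0%}, effective test share: {actual_test:.0%}")
```

If that reading is right, a dataset of 10,000 samples would get a test set of about 6,400 samples instead of 8,000.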