Closed Amrusama closed 1 week ago
Please differentiate between the "entire set" and the "training set." The training set is almost 75% of the entire set for each client.
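To make the distinction concrete, here is a minimal sketch of a per-client split in which roughly 75% of a client's samples become its training set. The function name `split_client` and the `train_ratio` and `seed` parameters are hypothetical and used only for illustration; this is not PFLib's actual implementation.

```python
import random


def split_client(samples, labels, train_ratio=0.75, seed=0):
    """Split one client's data into train/test parts (illustrative only)."""
    # Shuffle the indices so the split does not depend on the original order.
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)

    # The first train_ratio share of the shuffled indices is the training set.
    n_train = int(train_ratio * len(indices))
    train_idx, test_idx = indices[:n_train], indices[n_train:]

    train = ([samples[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([samples[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test
```

With this kind of split, a client holding 20 samples ends up with 15 training samples and 5 test samples, so counts taken from the training part alone are smaller than counts taken over the client's entire set.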
@TsingZ0 Thank you for the explanation. I suggest including this information in the dataset generation section. For instance, the entire FashionMNIST dataset comprises 70,000 samples (60,000 training and 10,000 test). My intention was for PFLib to provide a non-iid version of the entire training set.
The test set can be reshuffled after it has been split.
I generated a non-iid version of FashionMNIST for 15 clients using the following command:
python generate_FashionMNIST.py noniid - pat
The output of the data distribution on the terminal is as follows:
I printed the number of data points for each class in every client and plotted each client's class distribution, but the results didn't match the output on the terminal.
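For reference, a per-client class count of the kind described here can be sketched as follows. `client_labels` is a hypothetical mapping from client id to that client's list of labels; it is an assumption for illustration and does not reflect PFLib's on-disk format.

```python
from collections import Counter


def class_counts_per_client(client_labels):
    """Return {client_id: {class_label: count}} with classes in sorted order."""
    return {
        client_id: dict(sorted(Counter(labels).items()))
        for client_id, labels in client_labels.items()
    }
```

When comparing such counts with the terminal output, it is worth checking whether both were computed over the same data. Counts taken from a client's training set alone would be lower than counts taken over that client's entire set.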