Closed: codaibk closed this issue 3 years ago
Hi, is there any update or suggestion for fixing this issue? Your cifar10 setting only works with 10 clients, not with a flexible number of clients. @ZacharyGarrett
Hi,
I think there is a typo in the `utils/datasets/cifar10_dataset.py` file. The code on line 101 seems wrong.
https://github.com/google-research/federated/blob/42ec49634d9d27d0ac5d16820271d6d2cc5b55b9/utils/datasets/cifar10_dataset.py#L101
The correct one should be
for k in range(NUM_CLIENTS):
Thanks for investigating @xjiajiahao! Would you be willing to submit a pull request to make the change?
@xjiajiahao @ZacharyGarrett Changing only that line will not fix the problem, because the code builds `train_client_samples` by indexing into `train_example_indices`:
`train_client_samples[k].append(train_example_indices[sampled_label, train_count[sampled_label]])`
The size of `train_example_indices` is set by the number of examples in each class (5000 for train, 1000 for test).
When you change `NUM_CLIENTS`, `NUM_EXAMPLES_PER_CLIENT` and `TEST_SAMPLES_PER_CLIENT` change too, and that is what causes the error.
```
for k in range(NUM_CLIENTS):
  for i in range(NUM_EXAMPLES_PER_CLIENT):
    sampled_label = np.argwhere(
        np.random.multinomial(1, train_multinomial_vals[k, :]) == 1)[0][0]
    train_client_samples[k].append(
        train_example_indices[sampled_label, train_count[sampled_label]])
    train_count[sampled_label] += 1
    if train_count[sampled_label] == NUM_EXAMPLES_PER_CLIENT:
      train_multinomial_vals[:, sampled_label] = 0
      train_multinomial_vals = (
          train_multinomial_vals /
          train_multinomial_vals.sum(axis=1)[:, None])
```
When `NUM_CLIENTS` < 10, the error is: `IndexError: index 5000 is out of bounds for axis 1 with size 5000`
When `NUM_CLIENTS` > 10, the error is:
np.random.multinomial(1, train_multinomial_vals[k, :]) == 1)[0][0]
File "mtrand.pyx", line 4212, in numpy.random.mtrand.RandomState.multinomial
File "_common.pyx", line 338, in numpy.random._common.check_array_constraint
ValueError: pvals < 0, pvals > 1 or pvals contains NaNs
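Both errors seem to trace back to the check `train_count[sampled_label] == NUM_EXAMPLES_PER_CLIENT`, which compares a per-class counter against a per-client constant. With fewer than 10 clients the constant exceeds 5000, so a class can run out of examples before it is masked (the `IndexError`). With more than 10 clients, classes are masked early, and a row of `train_multinomial_vals` can end up all zeros, so the renormalization divides by zero (the NaN `ValueError`). The following is a minimal sketch of one possible workaround, not the repository's actual fix: it masks a class only when its pool of examples is really empty. The function name `partition_by_dirichlet` and its parameters are hypothetical.

```python
import numpy as np


def partition_by_dirichlet(labels, num_clients, alpha=0.1, seed=0):
    """Splits example indices among clients with Dirichlet label skew.

    Hypothetical helper for illustration; not part of the repository.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    # One shuffled pool of example indices per class.
    pools = [rng.permutation(np.flatnonzero(labels == c)).tolist()
             for c in classes]
    examples_per_client = len(labels) // num_clients
    # Row k holds client k's preferred class mixture.
    mixtures = rng.dirichlet(alpha * np.ones(len(classes)), size=num_clients)

    client_indices = [[] for _ in range(num_clients)]
    for k in range(num_clients):
        for _ in range(examples_per_client):
            # Mask classes by the examples actually left, not by a constant.
            available = np.array([len(pool) > 0 for pool in pools], dtype=float)
            probs = mixtures[k] * available
            if not probs.sum() > 0:
                # All preferred classes are exhausted (or the mixture is
                # degenerate): fall back to uniform over remaining classes.
                probs = available
            probs = probs / probs.sum()
            label = rng.choice(len(classes), p=probs)
            client_indices[k].append(pools[label].pop())
    return client_indices
```

Because at most `len(labels)` examples are ever handed out, at least one class pool is non-empty at every draw, so the probabilities can always be normalized regardless of how the number of clients relates to the number of classes.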
@hsidahmed865 has kindly offered to take a look and potentially submit a fix, as they have been bumping up against this. Thanks @hsidahmed865!
Hi @codaibk. This issue should have been fixed by commits 74fdc1680c33169714f577cdc3398c94d0326aff and 83b23c36a5fd29c4c89631a125d54947074699b4. Can you confirm whether or not this fixed your problem?
Sorry, but it does not fix the problem. The problem is that this code can't handle a flexible number of clients, as I mentioned above. Your commits don't change the algorithm; they only change the parameters. @zcharles8
Hi @codaibk. Can you verify that your version of the repository includes the commits I listed above? They add the functionality that allows the user to specify `num_clients`.
If so, can you run the following test using `bazel`: https://github.com/google-research/federated/blob/master/utils/datasets/cifar10_dataset_test.py
This test is passing for me, and explicitly tests `num_clients = 8`, `num_clients = 10`, and `num_clients = 100`.
@zcharles8 It seems you changed the run file `run_federated.py` in the differential privacy folder too. The old file calls `cifar10_dataset.py` to generate the data.
Could you tell me the command for running `cifar10_dataset_test.py`?
Thanks.
We recommend using Bazel (see https://bazel.build/). Once you have that configured, you can simply run `bazel test {path to test}:{test_name}` in order to run a test.
If you'd prefer not to use Bazel, you could run `cifar10_dataset.load_cifar10_federated` with different values of the `num_clients` argument, and make sure that you get a dataset with the requisite number of clients.
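As a rough sketch of that non-Bazel check: the helper below calls a loader with several client counts and compares the number of client IDs it gets back. It assumes the loader accepts a `num_clients` keyword and returns a `(train, test)` pair whose elements expose a `client_ids` attribute, and that both splits have the same number of clients; these are assumptions about `load_cifar10_federated`, not confirmed here. `check_client_counts` and `fake_loader` are hypothetical names, and the stand-in loader exists only so the snippet runs without downloading data.

```python
from types import SimpleNamespace


def check_client_counts(load_fn, client_counts):
    """Calls load_fn(num_clients=n) for each n and verifies the result."""
    results = {}
    for n in client_counts:
        train, test = load_fn(num_clients=n)
        results[n] = (len(train.client_ids), len(test.client_ids))
        assert results[n] == (n, n), f"num_clients={n} gave {results[n]}"
    return results


def fake_loader(num_clients):
    """Stand-in loader (hypothetical) that mimics the assumed return shape."""
    ids = [str(i) for i in range(num_clients)]
    return SimpleNamespace(client_ids=ids), SimpleNamespace(client_ids=ids)


print(check_client_counts(fake_loader, [8, 10, 100]))

# With the repository on your path, the same check would look like this
# (assuming the signature described above):
# check_client_counts(cifar10_dataset.load_cifar10_federated, [8, 10, 100])
```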
Hi @codaibk. I am marking this as resolved for now, as it is working according to all of our tests. If you are still seeing errors, please post your full stack trace, as well as the commands that resulted in the error.
Hi, I am running federated learning with the differential privacy folder. It seems cifar10 only works with 10 clients (number of clients = number of classes). When the number of clients differs from the number of classes (for example, 20 clients while cifar10 has 10 classes), the system raises an error. I think the code needs to be edited to support a flexible number of clients. Do you have any suggestions? I changed the number of clients using the code below:
And it gets this error:
Thanks.