Description - Githubissues

daishu-li commented 7 months ago

Could you share the description of the programming environment? It is important for us to reproduce the results.

mahi97 commented 7 months ago

Do you need the list of installed libraries?

daishu-li commented 7 months ago

Yes, the libraries and their corresponding versions.

daishu-li commented 7 months ago

When I run "evofed.py", I get the following error:

<ValueError: Custom node type mismatch: expected type: <class 'flax.core.frozen_dict.FrozenDict'>,
Traceback (most recent call last):
  File "D:\vscode\fed-des\fedes\EvoFL\evofed.py", line 161, in <module>
    run()
  File "D:\vscode\fed-des\fedes\EvoFL\evofed.py", line 156, in run
    manager.run(rngrun)
  File "D:\vscode\fed-des\fedes\EvoFL\evofed.py", line 113, in run
    clients, , _ = jax.vmap(sl.train_epoch, in_axes=(None, 0, 0, None, None))(self.state,
  File "D:\vscode\fed-des\fedes\EvoFL\backprop\sl.py", line 206, in train_epoch
    state, loss, accuracy = train_step(state, X_batch, Y_batch, rng_net)
  File "D:\vscode\fed-des\fedes\EvoFL\backprop\sl.py", line 88, in train_step
    state = state.apply_gradients(grads=grads)
  File "F:\anaconda\envs\torch\lib\site-packages\flax\training\train_state.py", line 101, in apply_gradients
    updates, new_opt_state = self.tx.update(

I think the cause may be a mismatched version of evosax or jax.

mahi97 commented 7 months ago

Sorry for the late reply; I will get back to you soon. For now, these are the library versions I use:

jax~=0.4.13
tqdm~=4.66.1
wandb~=0.15.8
evosax~=0.0.9
optax~=0.1.7
flax~=0.7.2
numpy~=1.24.3
gymnax~=0.0.6
chex~=0.1.7
brax~=0.0.12
dataclasses~=0.6
setuptools~=58.0.4
matplotlib~=3.4.3
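To compare a local environment against the pins listed above, a small script along these lines may help. It is only a sketch: `PINS`, `is_compatible`, and `check_versions` are hypothetical names introduced here, only a subset of the listed packages is included, and the `~=` check is a simplified approximation that handles plain numeric versions only.

```python
from importlib import metadata

# A subset of the pins listed in this thread (extend as needed).
PINS = {
    "jax": "0.4.13",
    "flax": "0.7.2",
    "optax": "0.1.7",
    "evosax": "0.0.9",
    "chex": "0.1.7",
}


def is_compatible(installed: str, pinned: str) -> bool:
    """Approximate the `~=` (compatible release) rule for numeric versions."""
    try:
        have = [int(part) for part in installed.split(".")]
        want = [int(part) for part in pinned.split(".")]
    except ValueError:
        return False  # pre-release or local version tags are not handled here
    # All components except the last pinned one must match exactly,
    # and the installed version must not be older than the pin.
    return have[: len(want) - 1] == want[:-1] and have >= want


def check_versions(pins, get_version=metadata.version):
    """Return a dict mapping each package name to 'ok', 'missing', or 'mismatch (...)'."""
    report = {}
    for name, pinned in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if is_compatible(installed, pinned):
            report[name] = "ok"
        else:
            report[name] = f"mismatch ({installed})"
    return report
```

Calling `check_versions(PINS)` inside the environment used to run "evofed.py" returns one status per package, which makes it easy to spot a jax, flax, or optax version that drifted from the list.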

wang-517 commented 3 days ago

Have you succeeded in reproducing the results? If it's convenient, could you share some contact information so we can communicate? Thank you!

daishu-li commented 3 days ago

No, I cannot reproduce the results; the error still occurs.

mahi97 commented 1 day ago

I'm sorry for not getting back to you sooner; I think the problem was with Flax FrozenDict. I updated the source; you should be able to run it without issues.
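The exact change made to the source is not shown in this thread. One plausible cause of a "Custom node type mismatch" involving FrozenDict is that the parameter tree and the gradient tree use different container types (one a FrozenDict, the other a plain dict), so a common workaround is to convert both to the same type before building the train state. The sketch below illustrates that idea using only the standard library: `to_plain_dict` is a hypothetical helper, and `MappingProxyType` is merely a stand-in for a read-only mapping such as FrozenDict. With Flax installed, `flax.core.unfreeze` serves the same purpose for real FrozenDict trees.

```python
from collections.abc import Mapping
from types import MappingProxyType


def to_plain_dict(tree):
    """Recursively copy any mapping (frozen or not) into ordinary dicts."""
    if isinstance(tree, Mapping):
        return {key: to_plain_dict(value) for key, value in tree.items()}
    return tree  # leaves (arrays, scalars, lists) are returned unchanged


# Stand-in for a frozen parameter tree (assumption: FrozenDict behaves as a
# read-only Mapping, which MappingProxyType mimics here).
frozen_params = MappingProxyType(
    {"Dense_0": MappingProxyType({"kernel": [1.0, 2.0], "bias": [0.0]})}
)

# After conversion every level is a plain dict, so params and grads built
# from it share one container type.
params = to_plain_dict(frozen_params)
```

Whichever direction is chosen (all frozen or all plain), the point is that the tree passed when the optimizer state is created and the tree passed as `grads` should have the same structure and container types.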

Please let me know if you need any other help.

I am also working on another project that includes various communication-efficient FL methods (including evofed) with better source code. I will update you guys as soon as it gets public.

mahi97 / EvoFL

Description #2