Open xlw686 opened 2 years ago
@xlw686 You set the worker number to 10, which may exceed your local machine's memory.
I changed the worker number to 1 and then ran the command below:
sh run_fedavg_distributed_pytorch.sh 1000 1 lr hetero 200 1 10 0.03 mnist "./../../../data/mnist" sgd 0
Below is the error message:
(fedml) root@VM-24-3-ubuntu:~/share/FedML/fedml_experiments/distributed/fedavg# sh run_fedavg_distributed_pytorch.sh 1000 1 lr hetero 200 1 10 0.03 mnist "./../../../data/mnist" sgd 0
2
/root/anaconda3/envs/fedml/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: greenlet.greenlet size changed, may indicate binary incompatibility. Expected 144 from C header, got 152 from PyObject
return f(*args, **kwds)
/root/anaconda3/envs/fedml/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: greenlet.greenlet size changed, may indicate binary incompatibility. Expected 144 from C header, got 152 from PyObject
return f(*args, **kwds)
/root/anaconda3/envs/fedml/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: greenlet.greenlet size changed, may indicate binary incompatibility. Expected 144 from C header, got 152 from PyObject
return f(*args, **kwds)
/root/anaconda3/envs/fedml/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: greenlet.greenlet size changed, may indicate binary incompatibility. Expected 144 from C header, got 152 from PyObject
return f(*args, **kwds)
/root/anaconda3/envs/fedml/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: greenlet.greenlet size changed, may indicate binary incompatibility. Expected 144 from C header, got 152 from PyObject
return f(*args, **kwds)
/root/anaconda3/envs/fedml/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: greenlet.greenlet size changed, may indicate binary incompatibility. Expected 144 from C header, got 152 from PyObject
return f(*args, **kwds)
/root/anaconda3/envs/fedml/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: greenlet.greenlet size changed, may indicate binary incompatibility. Expected 144 from C header, got 152 from PyObject
return f(*args, **kwds)
/root/anaconda3/envs/fedml/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: greenlet.greenlet size changed, may indicate binary incompatibility. Expected 144 from C header, got 152 from PyObject
return f(*args, **kwds)
Traceback (most recent call last):
File "./main_fedavg.py", line 43, in <module>
from fedml_api.distributed.fedavg.FedAvgAPI import FedML_init, FedML_FedAvg_distributed
File "/root/share/FedML/fedml_api/distributed/fedavg/FedAvgAPI.py", line 1, in <module>
from mpi4py import MPI
File "/root/share/mpi4py.py", line 1, in <module>
from mpi4py import MPI
ImportError: cannot import name 'MPI' from 'mpi4py' (/root/share/mpi4py.py)
Traceback (most recent call last):
File "./main_fedavg.py", line 43, in <module>
from fedml_api.distributed.fedavg.FedAvgAPI import FedML_init, FedML_FedAvg_distributed
File "/root/share/FedML/fedml_api/distributed/fedavg/FedAvgAPI.py", line 1, in <module>
from mpi4py import MPI
File "/root/share/mpi4py.py", line 1, in <module>
from mpi4py import MPI
ImportError: cannot import name 'MPI' from 'mpi4py' (/root/share/mpi4py.py)
The output says MPI cannot be imported from mpi4py, but I can import it in an interactive session, as follows:
ImportError: cannot import name 'MPI' from 'mpi4py' (/root/share/mpi4py.py)
(fedml) root@VM-24-3-ubuntu:~/share/FedML/fedml_experiments/distributed/fedavg# python
Python 3.7.4 (default, Aug 13 2019, 20:35:49)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mpi4py
>>> from mpi4py import MPI
>>> comm = MPI.COMM_WORLD
>>> process_id = comm.Get_rank()
>>> print(process_id)
0
>>>
I don't know what went wrong😂
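One thing the traceback itself points to: the failing import is served from /root/share/mpi4py.py, a single file whose first line is `from mpi4py import MPI`, not from the installed mpi4py package. That suggests a local file named mpi4py.py is shadowing the real package when the script runs, while the interactive session resolved the installed one. A minimal sketch for checking which file Python would load is below; the helper names `module_origin` and `is_single_file_module` are made up for illustration and are not part of FedML or mpi4py. For a faithful result, run it with the same sys.path as the failing script, for example by calling these helpers just before the failing import.

```python
import importlib.util
import os


def module_origin(name):
    """Return the file path Python would load for `name`, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def is_single_file_module(name, origin):
    """True if `origin` is a lone `<name>.py` file rather than a package directory."""
    return origin is not None and os.path.basename(origin) == name + ".py"


if __name__ == "__main__":
    origin = module_origin("mpi4py")
    print("mpi4py resolves to:", origin)
    if is_single_file_module("mpi4py", origin):
        # mpi4py is normally installed as a package directory, so a lone
        # mpi4py.py on sys.path is most likely a stray local file.
        print("A local mpi4py.py appears to shadow the installed package.")
```

If the reported path is /root/share/mpi4py.py, renaming or removing that file (and any matching entry under a `__pycache__` directory next to it) would be the first thing to try.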
The worker number should be at least 3.
I changed the worker number to 3, but the result is no different:
sh run_fedavg_distributed_pytorch.sh 1000 3 lr hetero 200 1 10 0.03 mnist "./../../../data/mnist" sgd 0
Below is the error message:
Traceback (most recent call last):
File "./main_fedavg.py", line 43, in <module>
from fedml_api.distributed.fedavg.FedAvgAPI import FedML_init, FedML_FedAvg_distributed
File "/root/share/FedML/fedml_api/distributed/fedavg/FedAvgAPI.py", line 1, in <module>
from mpi4py import MPI
File "/root/share/mpi4py.py", line 1, in <module>
from mpi4py import MPI
ImportError: cannot import name 'MPI' from 'mpi4py' (/root/share/mpi4py.py)
@xlw686 is this issue solved in the latest version?
@xlw686 Can you run your example using the latest dev branch?
I am experimenting with the tutorial below
Something seems to have gone wrong; no results were obtained.