FederatedAI / KubeFATE

Manage federated learning workload using cloud native technologies.
Apache License 2.0

Docker compose deployment not working in Ubuntu 22.04 #905

Open TomsyPaul opened 1 year ago

TomsyPaul commented 1 year ago

What deployment mode are you using? docker-compose

What KubeFATE and FATE version are you using? KubeFATE-1.11.2

What OS are you using for docker-compose or Kubernetes? Please also state the OS version.

To Reproduce

Deploying on Ubuntu 22.04 shows no errors, and the containers are up and running. However, the following command, used to check FL on the same system, times out: flow test toy --guest-party-id 10000 --host-party-id 10000

owlet42 commented 1 year ago

Please check that the command flow server versions returns correctly.

TomsyPaul commented 1 year ago

[root@b24912b32a3f fate]# flow server versions
{
    "data": {
        "API": "v1",
        "CENTOS": "7.2",
        "EGGROLL": "2.5.1",
        "FATE": "1.11.2",
        "FATEBoard": "1.11.1",
        "FATEFlow": "1.11.1",
        "JDK": "8",
        "MAVEN": "3.6.3",
        "PYTHON": "3.8",
        "SPARK": "3.4.0",
        "UBUNTU": "16.04"
    },
    "retcode": 0,
    "retmsg": "success"
}
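
For readers who want to check such a response programmatically, here is a minimal sketch. It assumes only the JSON shape shown in the output above (a "data" map plus "retcode" and "retmsg"); the function name check_versions_response is hypothetical and not part of the FATE Flow client.

```python
import json


def check_versions_response(raw: str) -> dict:
    """Parse the JSON text printed by `flow server versions`.

    Returns the "data" version map when retcode is 0 and raises
    ValueError otherwise. Hypothetical helper, not a FATE API.
    """
    resp = json.loads(raw)
    if resp.get("retcode") != 0:
        raise ValueError(
            f"retcode={resp.get('retcode')}, retmsg={resp.get('retmsg')}"
        )
    return resp.get("data", {})


# Sample input copied from the output posted in this thread.
sample = """
{
    "data": {
        "API": "v1",
        "CENTOS": "7.2",
        "EGGROLL": "2.5.1",
        "FATE": "1.11.2",
        "FATEBoard": "1.11.1",
        "FATEFlow": "1.11.1",
        "JDK": "8",
        "MAVEN": "3.6.3",
        "PYTHON": "3.8",
        "SPARK": "3.4.0",
        "UBUNTU": "16.04"
    },
    "retcode": 0,
    "retmsg": "success"
}
"""

versions = check_versions_response(sample)
print("FATE:", versions["FATE"], "| FATEFlow:", versions["FATEFlow"])
```

A successful response only shows that the FATE Flow server is reachable; it does not by itself explain the toy test timeout.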

owlet42 commented 1 year ago

Are you deploying FATE with Spark or with Eggroll? If it is Spark, please check whether the allocated resources are sufficient (compute_core>=8). If it is Eggroll, please check the other components and confirm that their logs contain no errors.

TomsyPaul commented 11 months ago

We use Eggroll. The Error (Algorithm) Log is attached.

Error Log from Fateboard.txt

Screenshot from 2023-10-04 17-30-04

owlet42 commented 10 months ago

Check the nodemanager log: enter the nodemanager container and look for the log file whose name is similar to task-xxxxx.
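
The search for those task logs can be sketched in Python, run inside the container. This is only an illustration under stated assumptions: the log directory must be supplied by the caller because its location is not given in this thread, the file-name pattern task-* is taken from the hint above, and the error markers are guesses at what a failing task would print.

```python
from pathlib import Path


def find_task_log_errors(log_root, markers=("ERROR", "Traceback", "Exception")):
    """Collect suspicious lines from files named task-* under log_root.

    log_root is supplied by the caller (the real nodemanager log
    directory is an assumption to verify in your deployment).
    Returns a list of (file path, line number, line text) tuples.
    """
    hits = []
    for path in sorted(Path(log_root).rglob("task-*")):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a log has odd bytes.
        with path.open(errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                if any(marker in line for marker in markers):
                    hits.append((str(path), lineno, line.rstrip("\n")))
    return hits
```

Calling find_task_log_errors with the container's log directory returns the matching lines, which can then be pasted into the issue instead of a screenshot.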