FederatedAI / FATE-Builder

A release packing tool for FATE
Apache License 2.0

Parameter to MergeFrom() must be instance of same class when building images from docker-build/build.sh #15

Closed jsuper closed 1 year ago

jsuper commented 1 year ago

After building images from source with FATE-Builder/docker-build/build.sh, the build stage finishes successfully, but jobs cannot be submitted once the FATE cluster is started with the custom images. The image build steps were as follows:

1. Clone the source code and check out version v1.8.0:

git clone repo-url --recurse-submodules && git checkout v1.8.0  && git submodule update

2. Run the build with the following commands:

export TAG=1.8.0-release 
cd FATE-Builder-ROOT/docker-build/ && sh build.sh all

After the build succeeds, start the FATE cluster with docker-compose and submit the toy example. The following error is thrown.

Full exception stack trace:

Traceback (most recent call last):
  File "/data/projects/fate/fateflow/python/fate_flow/scheduler/federated_scheduler.py", line 278, in federated_command
    federated_mode=federated_mode)
  File "/data/projects/fate/fateflow/python/fate_flow/utils/api_utils.py", line 78, in federated_api
    dest_party_id=dest_party_id, json_body=json_body, api_version=api_version, overall_timeout=overall_timeout)
  File "/data/projects/fate/fateflow/python/fate_flow/utils/api_utils.py", line 162, in federated_coordination_on_grpc
    overall_timeout=overall_timeout)
  File "/data/projects/fate/fateflow/python/fate_flow/utils/grpc_utils.py", line 49, in wrap_grpc_packet
    _src = proxy_pb2.Topic(name=job_id, partyId="{}".format(src_party_id), role=FATE_FLOW_SERVICE_NAME, callback=_src_end_point)
  File "/opt/app-root/lib/python3.6/site-packages/google/protobuf/internal/python_message.py", line 552, in init
    _ReraiseTypeErrorWithFieldName(message_descriptor.name, field_name)
  File "/opt/app-root/lib/python3.6/site-packages/google/protobuf/internal/python_message.py", line 477, in _ReraiseTypeErrorWithFieldName
    raise exc.with_traceback(sys.exc_info()[2])
  File "/opt/app-root/lib/python3.6/site-packages/google/protobuf/internal/python_message.py", line 550, in init
    copy.MergeFrom(new_val)
  File "/opt/app-root/lib/python3.6/site-packages/google/protobuf/internal/python_message.py", line 1314, in MergeFrom
    _FullyQualifiedClassName(msg.__class__)))
TypeError: Parameter to MergeFrom() must be instance of same class: expected basic_meta_pb2.Endpoint got basic_meta_pb2.Endpoint. for field Topic.callback
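
The error names the same class on both sides ("expected basic_meta_pb2.Endpoint got basic_meta_pb2.Endpoint"), which suggests that two distinct class objects share one qualified name. The sketch below reproduces that effect with plain Python classes only. It is an illustration, not protobuf's actual implementation, and `make_endpoint_class`, `qualified_name`, and `merge_from` are hypothetical helper names.

```python
# Illustration only: plain Python classes, not protobuf internals.

def make_endpoint_class():
    """Create a new class object on every call; each one is named 'Endpoint'."""
    class Endpoint:
        pass
    # Give every copy the same module and qualified name (assumed for the demo).
    Endpoint.__module__ = "basic_meta_pb2"
    Endpoint.__qualname__ = "Endpoint"
    return Endpoint


def qualified_name(cls):
    """Return 'module.ClassName' for a class object."""
    return "{}.{}".format(cls.__module__, cls.__qualname__)


def merge_from(expected_cls, value):
    """Hypothetical strict check: reject values that are not instances of expected_cls."""
    if not isinstance(value, expected_cls):
        raise TypeError(
            "expected {} got {}".format(
                qualified_name(expected_cls), qualified_name(type(value))
            )
        )
    return value


# Two separate class objects that print identically.
EndpointA = make_endpoint_class()
EndpointB = make_endpoint_class()


def demo():
    """Pass an instance of one copy where the other copy is expected."""
    try:
        merge_from(EndpointA, EndpointB())
    except TypeError as exc:
        return str(exc)
    return "ok"


if __name__ == "__main__":
    print(demo())
```

The names match, but `isinstance` compares class identity, so the check fails even though the printed names are identical.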

UPDATE: Building with build_cluster_docker.sh from https://github.com/FederatedAI/FATE/tree/v1.8.0/build/docker-build hits the same problem.

Steps:

1. Clone the code as in the previous steps.
2. Modify fate.env and docker-build/.env to set the custom version.
3. Start the build:

cd docker-build && sh build_cluster_docker.sh all

4. Wait for the build to finish, then use docker-compose to start a cluster (FATE on EggRoll).
5. Run the single-party toy example; the exception then occurs:

flow test toy -gid 10000 -hid 10000

owlet42 commented 1 year ago

FATE-Builder does not support building FATE v1.8. To build FATE v1.8, please use the scripts at https://github.com/FederatedAI/FATE/tree/v1.8.0/build/docker-build

jsuper commented 1 year ago

Hi @owlet42, thank you very much.

I tried the docker-build script you mentioned. The result is the same as described in the issue.

I am not sure: do I need to regenerate all the protobuf files before building the images?

jsuper commented 1 year ago

Issue fixed

This error is caused by an incorrect protobuf version. FATE's requirements.txt does not cap the maximum version of the protobuf package, so protobuf 4.21.4 is installed during the build stage. That version causes the type-check error described in this issue.
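
To confirm which protobuf version actually ended up in an image, something like the following could be run with the container's Python interpreter. This is a minimal sketch: `installed_version` is a hypothetical helper, and the `pkg_resources` fallback assumes setuptools is present on older interpreters such as the Python 3.6 shown in the traceback.

```python
def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        from importlib import metadata  # available on Python 3.8+
    except ImportError:
        metadata = None

    if metadata is not None:
        try:
            return metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            return None

    # Fallback for Python 3.6/3.7; assumes setuptools is installed.
    import pkg_resources
    try:
        return pkg_resources.get_distribution(dist_name).version
    except pkg_resources.DistributionNotFound:
        return None


if __name__ == "__main__":
    # Prints the version string, or None when protobuf is not installed.
    print("protobuf:", installed_version("protobuf"))
```

If the printed version is 4.x, the image picked up the uncapped release described above.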

How to fix:

Edit FATE/python/requirements.txt, find the line protobuf>=3.6.1, and change it to the following:

protobuf>=3.6.1,<3.20.3
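
As a quick sanity check of what the pin above admits, here is a small sketch that tests a version string against the range `>=3.6.1,<3.20.3`. It is a simplified comparison on numeric release segments, not a full PEP 440 implementation, and `parse_release` and `satisfies_pin` are hypothetical names.

```python
import re


def parse_release(version):
    """Turn '3.20.1' into (3, 20, 1); any non-numeric suffix such as 'rc1' is ignored."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def satisfies_pin(version, lower="3.6.1", upper="3.20.3"):
    """True if lower <= version < upper, mirroring 'protobuf>=3.6.1,<3.20.3'."""
    release = parse_release(version)
    return parse_release(lower) <= release < parse_release(upper)


if __name__ == "__main__":
    for candidate in ("3.6.1", "3.20.1", "3.20.3", "4.21.4"):
        print(candidate, satisfies_pin(candidate))
```

With this check, 4.21.4 (the version the unpinned build installed) falls outside the range, while 3.x releases below 3.20.3 are accepted.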