ray-project / ray_beam_runner

Ray-based Apache Beam runner
Apache License 2.0
41 stars 12 forks source link

Standardize how we fix serialization problems for Protobufs #29

Closed pabloem closed 1 year ago

pabloem commented 2 years ago

From review:

Nit / general remark: It's a bit unfortunate that we keep running into the serialization issue, and sometimes solve it by using a custom reduce, sometimes by registering a custom serializer (ray.util.register_serializer), and sometimes manually (SerializeToString / FromString). It would be good if we could enable serialization for all the protobuf components in one central place - I'm not sure how that could be done though, as ray.util.register_serializer would have to be called on every ray worker that transmits protobuf objects. Maybe something to discuss?

wilsonwang371 commented 2 years ago

Hi Pablo,

Can you point out some of the places of using different serialization/deserialization ?

iasoon commented 2 years ago

Example for each: __reduce__: https://github.com/ray-project/ray_beam_runner/blob/master/ray_beam_runner/portability/execution.py#L454 manual conversion: https://github.com/ray-project/ray_beam_runner/blob/master/ray_beam_runner/portability/execution.py#L65 SerializeToString: https://github.com/ray-project/ray_beam_runner/blob/master/ray_beam_runner/portability/execution.py#L144 register_serializer: https://github.com/ray-project/ray_beam_runner/blob/master/ray_beam_runner/portability/execution.py#L375

rkenmi commented 1 year ago

Hi @iasoon ,

Just wanted to understand a bit more about the issue 🙂

It seems like there are some pros and cons with each serialization approach:

Curious for your opinion on keeping a mix of them or stick to one convention of using __reduce__ or ray.util.register_serializer for consistency?

iasoon commented 1 year ago

@rkenmi Personally, I definitely prefer the register_serializer approach, as it is more universal - we don't always serialize protobuf objects as part of an owning object, but also just as regular task arguments.

A while back, I took a stab at implementing this for all protobuf types we depend on: https://github.com/pabloem/ray_beam_runner/pull/1/commits/3445540af3e1515785b898d86f9708b175eae8e3#diff-7507ff302fc9e22524c6fea6648354d4dc21c5076957b063efc296acf929f742 I'm not really sure what the best way would be to include this in our code. I guess the most straightforward way would be to indeed add a manual call to each task that needs to serialize protobuf messages. I imagine this could get tedious though.

There is also this effort that would help us, but I don't know where that went. https://github.com/ray-project/ray/pull/21383

I definitely agree that having custom __reduce__s is a good idea for some of the structs we have, but I think that's orthogonal to the serialization of the protobuf messages themselves. We could definitely do both.

Hope that helps! Happy to discuss more.

rkenmi commented 1 year ago

Nice, thanks for sharing the snippet!

I'm not really sure of a good way either for propagating register_serializer to all workers. The manual call seems okay with only a handful of Ray tasks/actors right now.

I did find this API which looks useful, but they recently marked it as deprecated due to reliability issues.

iasoon commented 1 year ago

I'm not sure either. Too bad that that API was deprecated! Maybe we can ask the ray team whether there is a recommended way to do this?

I think technically we don't need it on any worker either, but only the ones that serialize protobuf messages. If my memory serves well, I think that's currently only the main task and the run_bundle tasks.