apache / fury

A blazingly fast multi-language serialization framework powered by JIT and zero-copy.
https://fury.apache.org/
Apache License 2.0

[debugtool] Parse serialized bytes and show their binary structure in YAML format; the reverse conversion would also be helpful. #1727

Open justincui opened 1 month ago

justincui commented 1 month ago

Is your feature request related to a problem? Please describe.

I'm always frustrated when debugging serialized data because it's difficult to visualize the binary structure. Understanding the detailed binary layout and content of serialized bytes would greatly enhance debugging and development efficiency.

Describe the solution you'd like

I would like a feature in the Fury debugtool component that can parse serialized bytes and display their binary structure in a human-readable YAML format (with depth control as a parameter). Additionally, the ability to convert the YAML representation back into the original binary format would be extremely helpful for verifying and testing changes during debugging. This bidirectional functionality would provide a robust tool for developers to work seamlessly with serialized data.
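To make the depth-control idea concrete, here is a minimal sketch of how a decoded layout tree could be rendered as YAML with a `max_depth` parameter. The `Node` class, the `render` function, and the example offsets are all hypothetical illustrations; none of them is part of Fury, and the tree is built by hand rather than parsed from real Fury bytes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical description of one region of a serialized buffer."""
    kind: str      # e.g. "struct", "int64", "string"
    offset: int    # byte offset where the region starts
    size: int      # number of bytes the region covers
    children: dict = field(default_factory=dict)  # field name -> Node


def render(name, node, max_depth, level=0):
    """Return YAML lines for `node`, expanding at most `max_depth` nesting levels."""
    pad = "    " * level
    lines = [
        f"{pad}{name}:",
        f"{pad}  type: {node.kind}",
        f"{pad}  offset: {node.offset}",
        f"{pad}  size: {node.size}",
    ]
    if node.children:
        if level + 1 >= max_depth:
            # Depth limit reached: summarize instead of expanding.
            lines.append(f'{pad}  fields: "<{len(node.children)} omitted>"')
        else:
            lines.append(f"{pad}  fields:")
            for child_name, child in node.children.items():
                lines.extend(render(child_name, child, max_depth, level + 1))
    return lines


# Hand-built example tree (made-up offsets and sizes, not a real Fury layout).
tree = Node("struct", 0, 30, {
    "id": Node("int64", 1, 8),
    "owner": Node("struct", 9, 21, {"name": Node("string", 10, 20)}),
})

print("\n".join(render("root", tree, max_depth=2)))
```

With `max_depth=2` the nested `owner` struct is summarized rather than expanded; raising the limit to 3 shows its `name` field as well. The reverse direction (YAML back to bytes) would additionally need a YAML parser and an encoder for the actual wire format, which this sketch does not attempt.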

Additional context

Implementing this feature would bridge the gap between serialized data and its human-readable representation, making it easier to inspect, debug, and validate serialized objects. It would be particularly useful for developers working with complex data serialization and deserialization processes, enabling them to make changes in YAML and convert them back to binary for testing purposes.

chaokunyang commented 1 month ago

Thanks for proposing this feature. It would be very useful for debugging: with it, we could inspect the binary data without having to deserialize it with the type schema.

This is possible in meta-shared compatible mode, where Fury writes all the meta of a class into the binary data, so the data can be decoded and printed. We can implement this tool in Python for xlang serialization. For pure Java serialization we need to implement it in Java, but it's possible to package that jar into the Python wheel and build a command-line tool that parses the data using the jar.

In schema-consistent mode, it's not possible to parse the bytes, because Fury skips writing some object meta, so we don't know how to parse the data.
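The difference between the two modes can be illustrated with a toy self-describing format. This is only a sketch: the tag values, the `encode`/`decode` functions, and the byte layout below are invented for illustration and are not Fury's wire format. Because every value carries a type tag and every struct carries its field names, the bytes can be decoded generically, with no external schema. If the tags and names were stripped out, as when object meta is not written, the same decoder would have no way to tell where one value ends and the next begins.

```python
import struct

# Hypothetical one-byte type tags for the toy format.
TAG_INT, TAG_STR, TAG_LIST, TAG_STRUCT = 1, 2, 3, 4


def encode(value) -> bytes:
    """Encode ints, strings, lists and dicts with embedded type information."""
    if isinstance(value, int):
        return bytes([TAG_INT]) + struct.pack(">q", value)
    if isinstance(value, str):
        raw = value.encode("utf-8")
        return bytes([TAG_STR]) + struct.pack(">I", len(raw)) + raw
    if isinstance(value, list):
        body = b"".join(encode(item) for item in value)
        return bytes([TAG_LIST]) + struct.pack(">I", len(value)) + body
    if isinstance(value, dict):
        out = bytes([TAG_STRUCT]) + struct.pack(">I", len(value))
        for name, item in value.items():
            raw = name.encode("utf-8")
            # Field names are written inline, so the reader needs no schema.
            out += struct.pack(">I", len(raw)) + raw + encode(item)
        return out
    raise TypeError(f"unsupported type: {type(value).__name__}")


def decode(buf: bytes, pos: int = 0):
    """Decode one value starting at `pos`; return (value, next position)."""
    tag = buf[pos]
    pos += 1
    if tag == TAG_INT:
        return struct.unpack_from(">q", buf, pos)[0], pos + 8
    if tag == TAG_STR:
        (length,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        return buf[pos:pos + length].decode("utf-8"), pos + length
    if tag == TAG_LIST:
        (count,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        items = []
        for _ in range(count):
            item, pos = decode(buf, pos)
            items.append(item)
        return items, pos
    if tag == TAG_STRUCT:
        (count,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        fields = {}
        for _ in range(count):
            (length,) = struct.unpack_from(">I", buf, pos)
            pos += 4
            name = buf[pos:pos + length].decode("utf-8")
            pos += length
            fields[name], pos = decode(buf, pos)
        return fields, pos
    raise ValueError(f"unknown tag {tag} at offset {pos - 1}")


payload = {"name": "fury", "tags": ["jit", "zero-copy"], "stars": 2960}
data = encode(payload)
decoded, end = decode(data)
```

Here `decoded` equals `payload` and `end` equals `len(data)`, so the round trip is lossless. A real tool would read the class meta that Fury embeds in meta-shared mode instead of these made-up tags.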