Add de/serialization capability

CAD97 commented 5 years ago

Add a simple "obviously correct" serde de/serialization impl behind the "serde" feature.

A node is represented as a singleton map from the syntax kind to an anonymous enumeration of either the leaf text or the branch's children. Because of that anonymous enumeration, deserialization only works with self-describing formats.
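To make the shape concrete, here is a minimal std-only sketch of that representation rendered as JSON. It is not the code in this PR: `Node`, `quote`, and `to_json` are hypothetical names, the tree type is a stand-in rather than rowan's own, and the output is hand-rolled instead of going through serde.

```rust
/// Hypothetical stand-in for a syntax tree node (not rowan's actual type).
#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    /// A token: its syntax kind plus the source text it covers.
    Leaf { kind: String, text: String },
    /// An interior node: its syntax kind plus its children, in order.
    Branch { kind: String, children: Vec<Node> },
}

/// Quote and escape a string as a JSON string literal.
fn quote(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Render a node as a singleton map `{ kind: payload }`, where the payload
/// is a string for a leaf and an array of child maps for a branch.
/// No spans or offsets are written, only kinds and leaf text.
pub fn to_json(node: &Node) -> String {
    match node {
        Node::Leaf { kind, text } => format!("{{{}:{}}}", quote(kind), quote(text)),
        Node::Branch { kind, children } => {
            let items: Vec<String> = children.iter().map(to_json).collect();
            format!("{{{}:[{}]}}", quote(kind), items.join(","))
        }
    }
}
```

A reader has to look at the payload (string vs. array) to tell a leaf from a branch, which is why only self-describing formats can round-trip this shape.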

I suspect that this will be most useful for serializing to textual formats for comparison in tests, similar to the debug representation. (In fact, I have an adapter doing so in my usage.) Using a structured representation carries some benefit (editor folding support and such) and theoretically can enable more semantic diffing than a textual diff.

Omitting span information doesn't lose anything, because the representation stores the leaf text instead, and spans can be recomputed from the text lengths. This also means that inserting a new subtree doesn't change the representation before or after it, which keeps the diff small.

Serialization and deserialization live in the same file to keep the implementation obvious. Deserialization is mostly along for the ride with serialization, and could be dropped if we don't want to support it, since it adds significant complexity compared to the Serialize implementation.

An untagged enum is handled via serde_derive to avoid reimplementing that complexity locally, and to make it more likely to be correct.

(needs tests)

matklad commented 5 years ago

I think I am fine with it, but I’d love to hear more about motivation.

If the sole goal here is serialization to a textual format for comparison, then I think walking the tree manually with postorder should be more convenient than going via serde.
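For comparison, a hand-written walk could look roughly like the sketch below. It assumes a hypothetical `Node` type rather than rowan's API, and the `dump` / `dump_into` names are made up for illustration. The walk is depth-first; this version prints each node before its children so the dump reads top-down.

```rust
/// Hypothetical stand-in for a syntax tree node (not rowan's actual type).
pub enum Node {
    Leaf { kind: &'static str, text: String },
    Branch { kind: &'static str, children: Vec<Node> },
}

/// Produce an indented, line-per-node text dump suitable for test snapshots.
pub fn dump(node: &Node) -> String {
    let mut out = String::new();
    dump_into(node, 0, &mut out);
    out
}

/// Depth-first walk: write this node's line, then recurse into its children
/// one indentation level deeper.
fn dump_into(node: &Node, depth: usize, out: &mut String) {
    let indent = "  ".repeat(depth);
    match node {
        Node::Leaf { kind, text } => {
            // `{:?}` quotes and escapes the leaf text so whitespace stays visible.
            out.push_str(&format!("{}{} {:?}\n", indent, kind, text));
        }
        Node::Branch { kind, children } => {
            out.push_str(&format!("{}{}\n", indent, kind));
            for child in children {
                dump_into(child, depth + 1, out);
            }
        }
    }
}
```

The result is easy to eyeball and diff, but it is an ad-hoc format: nothing else can parse it without a matching hand-written reader.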

matklad commented 5 years ago

Note that the serialization format is a public API, so changing the way we represent nodes will be a breaking change.

CAD97 commented 5 years ago

I'm dropping this PR: it's a lot less useful now that the node types are no longer generic. I'm still in favor of that change, but it means that this has to be done by the consumer if it's wanted.

The reason to use serde rather than a manual postorder walk (or .to_string()) is that I want to be able to work with an actual data format that I can manipulate rather than something ad-hoc.

matklad commented 5 years ago

Added serialization in https://github.com/rust-analyzer/rowan/pull/27

rust-analyzer / rowan

Add de/serialization capability #11