lutaml / expressir

Ruby parser for the ISO EXPRESS language

Support exporting and importing a parse tree cache #49

Closed w00lf closed 3 years ago

w00lf commented 3 years ago

From lutaml/lutaml-express#4:

We want to allow a defined set (via YAML or XML) of EXPRESS schemas available for Metanorma to navigate.

This requires Expressir to have the following functionality:

ronaldtse commented 3 years ago

@zakjan I heard from @w00lf there is some confusion on why this task is necessary.

The goal here is to create a serialized parse tree so that Expressir does not need to re-parse the EXPRESS schemas, e.g. a Marshal-ed instance of the parse tree in Ruby.

ISO 10303 contains hundreds of EXPRESS schemas -- as you know parsing all of them takes a long time. Since we need to load the entirety of these schemas in the compilation of the hundreds of parts of ISO 10303, each individually a document that will load this parse tree, we need to optimize the loading time of this parse tree.

Hence it is important for us to be able to parse once (slow) and then load many times (fast).

Let me know if this is unclear. Thanks!

zakjan commented 3 years ago

I'm sorry, the confusion was not about why, but what's needed from me. With @w00lf we came up with this API:

filepath is the marshaled file, model is any Expressir model class

The parser needs to be called on the consumer side if filepath doesn't exist yet, or if from_cache throws (a generic Marshal.load exception, or current Expressir version != cache version).

I'll try Ruby Marshal.dump/load for the first implementation. If there are issues with it, we could switch to a different format (JSON/YAML, that's where I thought from/to_map methods would be useful).
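The consumer-side fallback described in this comment could look roughly like the sketch below. It is only an illustration: `CACHE_VERSION`, `write_cache`, `read_cache` and `load_or_parse` are hypothetical names, not Expressir's API, and the block passed to `load_or_parse` stands in for the real (slow) parser call.

```ruby
require "tmpdir"

# Hypothetical names throughout; none of this is Expressir's actual API.
CACHE_VERSION = "0.0.0-example"

class CacheVersionMismatch < StandardError; end

# Store the model together with the version that produced it.
def write_cache(filepath, model)
  payload = { version: CACHE_VERSION, model: model }
  File.binwrite(filepath, Marshal.dump(payload))
end

# Raises if the file is missing, corrupt, or written by another version.
def read_cache(filepath)
  payload = Marshal.load(File.binread(filepath))
  unless payload.is_a?(Hash) && payload[:version] == CACHE_VERSION
    raise CacheVersionMismatch, "cache was not written by #{CACHE_VERSION}"
  end
  payload[:model]
end

# Consumer side: try the cache first; on any failure run the parser
# (the given block) and refresh the cache for the next run.
def load_or_parse(filepath)
  read_cache(filepath)
rescue StandardError
  model = yield
  write_cache(filepath, model)
  model
end

# Usage: the block runs only on the first call, the second call hits the cache.
Dir.mktmpdir do |dir|
  path = File.join(dir, "schemas.cache")
  parses = 0
  2.times do
    load_or_parse(path) do
      parses += 1
      { "schemas" => ["example"] } # stand-in for a parsed model
    end
  end
  puts "parser ran #{parses} time(s)"
end
```

Keeping the version inside the payload means a stale cache fails in the same way as a corrupt one, so the consumer needs only one rescue path.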

ronaldtse commented 3 years ago

Ah! Maybe something like Cache.to_file and from_file.

Instead of Ruby’s Marshal (which embeds Ruby classes), it would be better to use proper YAML or a zipped YAML format (JSON also works, but again I prefer YAML for readability and compactness), in case parsers in other languages need to read it.

zakjan commented 3 years ago

I understand the concerns about marshaling. YAML/JSON requires implementing from/to_map methods first, so it's going to take longer; I can have it ready by the end of this week.

ronaldtse commented 3 years ago

@zakjan okay, let's go with zipped YAML then, thanks.

zakjan commented 3 years ago

Note: there is already ModelElement.from_hash, and to_hash on all ModelElement instances
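To illustrate how a to_hash/from_hash pair could feed the zipped YAML format agreed above, here is a rough sketch. `ExampleElement`, `to_zipped_yaml` and `from_zipped_yaml` are hypothetical stand-ins. The real ModelElement methods may take different arguments and produce a different hash layout, and gzip is assumed here as the compression format.

```ruby
require "tmpdir"
require "yaml"
require "zlib"

# Hypothetical stand-in for a model element; only the to_hash/from_hash
# round trip mirrors what the thread describes.
class ExampleElement
  attr_reader :id, :children

  def initialize(id:, children: [])
    @id = id
    @children = children
  end

  def to_hash
    { "id" => id, "children" => children.map(&:to_hash) }
  end

  def self.from_hash(hash)
    new(id: hash["id"], children: hash["children"].map { |c| from_hash(c) })
  end
end

# Dump plain hashes, arrays and strings only, so the YAML carries no
# Ruby class tags and stays readable from other languages.
def to_zipped_yaml(filepath, element)
  Zlib::GzipWriter.open(filepath) { |gz| gz.write(YAML.dump(element.to_hash)) }
end

# safe_load permits only basic types; aliases are allowed in case the
# dumper emitted anchors for repeated objects.
def from_zipped_yaml(filepath)
  yaml = nil
  Zlib::GzipReader.open(filepath) { |gz| yaml = gz.read }
  ExampleElement.from_hash(YAML.safe_load(yaml, aliases: true))
end

# Usage: write a small tree, read it back, and compare the hashes.
Dir.mktmpdir do |dir|
  path = File.join(dir, "schemas.yaml.gz")
  tree = ExampleElement.new(id: "root", children: [ExampleElement.new(id: "child")])
  to_zipped_yaml(path, tree)
  puts from_zipped_yaml(path).to_hash == tree.to_hash
end
```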