Open goodmami opened 5 years ago
I think we might leave it as a Python dictionary for the moment, and concentrate on using and extending it.
Converting to TOML looks like it may make it easier to edit down the road.
For now I've settled on having data structures that behave like dictionaries or classes in that they allow for both key-lookup (e.g. rels['hypernym']['name']['en']) and dot-access (rels.hypernym.name.en). The former is useful when you have the relation or property name in a variable and prefer rels[relation] over getattr(rels, relation), while the latter is much simpler and makes editing the file easier. I also made the data structures raise errors on invalid keys/attributes and defined inventories of valid relations, forms, projects, languages, etc., in order to reduce errors caused by simple typos.
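A minimal sketch of the kind of structure described above, not the actual gwadoc implementation: the class name Node and the inventories RELATIONS, FIELDS and LANGUAGES are made up for illustration. It supports both key-lookup and dot-access and rejects names that are not in the inventory.

```python
class Node:
    """Container allowing both key-lookup and dot-access (hypothetical sketch).

    Only names listed in `allowed` are valid; anything else raises an error.
    """

    def __init__(self, allowed):
        # Bypass our own __setattr__ while storing internal state.
        object.__setattr__(self, "_allowed", frozenset(allowed))
        object.__setattr__(self, "_data", {})

    def _check(self, key):
        if key not in self._allowed:
            raise KeyError(f"invalid key: {key!r}")

    def __getitem__(self, key):
        self._check(key)
        return self._data[key]

    def __setitem__(self, key, value):
        self._check(key)
        self._data[key] = value

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self[name]
        except KeyError as exc:
            raise AttributeError(name) from exc

    def __setattr__(self, name, value):
        try:
            self[name] = value
        except KeyError as exc:
            raise AttributeError(name) from exc


# Hypothetical inventories of valid names.
RELATIONS = {"hypernym", "hyponym"}
FIELDS = {"name", "definition"}
LANGUAGES = {"en", "ja"}


def make_relations():
    """Build an empty relation -> field -> language tree."""
    rels = Node(RELATIONS)
    for rel in RELATIONS:
        rels[rel] = Node(FIELDS)
        for fld in FIELDS:
            rels[rel][fld] = Node(LANGUAGES)
    return rels


rels = make_relations()
rels.hypernym.name.en = "hypernym"   # dot-access when editing data
relation = "hypernym"
value = rels[relation]["name"]["en"]  # key-lookup when the name is in a variable
```

With this sketch, a typo such as rels.hypernim raises AttributeError and rels['hypernim'] raises KeyError instead of silently creating a new entry.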
I'll leave this issue open as a feature request for future versions.
We need to decide on a good way to store the gwadoc data, but it's not yet clear what the intended uses are or who the intended users are beyond generating the HTML documentation. The current (not checked-in) data is a Python file that fills dictionaries with data. If generating documentation is the only use, we may as well put it directly into reStructuredText. If we want a Python API, e.g., to request the localized name, definition, reverse, etc. from OMW, then it might make sense to make Python classes (Sphinx's autodoc could possibly be used to generate the docs, then).
In either case we could store the data in a data file and transform it (perhaps with validation) into the target representation. I propose using TOML. Even though it is relatively new and not in the standard library, it was chosen for Rust's package manager and for the future of Python packaging (see PEP 518), so it is supported by major projects.
Here's what (part of) hypernym would look like:
There's some flexibility in TOML (though not as much as in YAML, which is a good thing). Something like this would be equivalent, e.g., if you want to group all attributes by language:
And while I would like to place this file (gwadoc.toml or whatever) at the top level so it's more prominent for non-Python users/contributors, that would make it much more difficult to distribute with the project and for the Python code to find when run. So it might go under gwadoc/gwadoc.toml instead.
As an alternative, if we don't care much about non-Python users, we could make a Python class like Relation and do things like this:
Then query it like this:
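One possible shape for such a class, as a hedged sketch only: the field names, the localized_name method and the sample values are assumptions made for illustration, not the proposal's actual code.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Relation:
    """Hypothetical container for one relation's localized documentation."""

    key: str
    name: Dict[str, str] = field(default_factory=dict)        # language -> name
    definition: Dict[str, str] = field(default_factory=dict)  # language -> text
    reverse: Optional[str] = None                              # key of the inverse relation

    def localized_name(self, lang, fallback="en"):
        """Return the name in `lang`, falling back to `fallback` if missing."""
        if lang in self.name:
            return self.name[lang]
        return self.name[fallback]


# Defining a relation (placeholder values).
hypernym = Relation(
    key="hypernym",
    name={"en": "hypernym"},
    definition={"en": "placeholder definition"},
    reverse="hyponym",
)

# Querying it.
english_name = hypernym.localized_name("en")
fallback_name = hypernym.localized_name("ja")  # no 'ja' entry, falls back to 'en'
inverse = hypernym.reverse
```

A class like this keeps the data and the lookup logic together, at the cost of making the data harder to edit for contributors who don't write Python.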