File format specification?

I am wondering whether there is a formal specification that every UMR file is supposed to follow. The guidelines in this repository give some idea (even if incomplete) of the sentence-level and document-level graphs. However, they do not say that there are four annotation blocks for each sentence (tokens, sentence graph, alignment, document graph), that each block is followed by an empty line, that the last one is followed by two empty lines, etc.
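
To make the layout I am guessing at concrete, here is a minimal sketch of how a validator might check it. It only encodes my reading of the data (four blocks per sentence, one empty line between blocks, two after the last block), not any official rule; the names `split_sentences`, `check_block_counts` and `EXPECTED_BLOCKS` are hypothetical.

```python
# Assumption: tokens, sentence graph, alignment, document graph.
EXPECTED_BLOCKS = 4


def split_sentences(text):
    """Split file content into sentences, each a list of blocks (lists of lines).

    Assumed layout: one empty line ends a block, and a run of two or more
    empty lines ends a sentence.
    """
    sentences, blocks, current = [], [], []
    blank_run = 0
    for line in text.splitlines():
        if line.strip() == "":
            blank_run += 1
            if current:
                # An empty line closes the block collected so far.
                blocks.append(current)
                current = []
            if blank_run >= 2 and blocks:
                # A second consecutive empty line closes the sentence.
                sentences.append(blocks)
                blocks = []
        else:
            blank_run = 0
            current.append(line)
    # Flush whatever is left if the file lacks the trailing empty lines.
    if current:
        blocks.append(current)
    if blocks:
        sentences.append(blocks)
    return sentences


def check_block_counts(text):
    """Return (sentence_number, block_count) for sentences with an unexpected count."""
    return [
        (number, len(blocks))
        for number, blocks in enumerate(split_sentences(text), 1)
        if len(blocks) != EXPECTED_BLOCKS
    ]
```

Calling `check_block_counts` on the content of a file returns an empty list if every sentence has exactly four blocks under these assumptions. A real specification would also have to say what each block must contain, which this sketch does not attempt.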

I am writing a validation script for UMR, and it would probably be easier to follow a specification (if one exists) than to guess from the data files what is allowed and what is not.

BTW, the data in UMR release 1.0 seem to follow different conventions in different languages, which also differ from what the guidelines say, and occasionally they have issues that are clearly bugs regardless of any specification (such as non-matching brackets).
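
The non-matching brackets, at least, can be detected without a specification. Below is a minimal sketch of such a check, assuming that the graphs use parentheses in a PENMAN-like notation and that parentheses inside double-quoted strings should be ignored; both are my assumptions, and `find_bracket_errors` is a hypothetical name.

```python
def find_bracket_errors(graph_text):
    """Return a list of (line, column, message) for unbalanced parentheses.

    Assumption: parentheses inside double-quoted strings do not count.
    Escaped quotes are not handled in this sketch.
    """
    errors = []
    open_positions = []  # stack of (line, column) of '(' not yet closed
    in_string = False
    for lineno, line in enumerate(graph_text.splitlines(), 1):
        for col, ch in enumerate(line, 1):
            if ch == '"':
                in_string = not in_string
            elif in_string:
                continue
            elif ch == "(":
                open_positions.append((lineno, col))
            elif ch == ")":
                if open_positions:
                    open_positions.pop()
                else:
                    errors.append((lineno, col, "unmatched ')'"))
    # Anything still on the stack was opened but never closed.
    errors.extend((l, c, "unclosed '('") for l, c in open_positions)
    return errors
```

Positions are 1-based, so an error reported at `(1, 8, ...)` points to the eighth character of the first line of the graph text passed in.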

umr4nlp / umr-guidelines
