Problem
Within the content model, we want to know, for each possible child element, "whether they are required, and how many are allowed (0 or 1, exactly 1, 1 or more, 0 or more)." For element-type content models, this is not completely trivial.
Algorithm
For each leaf node:
    Assign starting implicit quantifiers min=1, max=1
    foreach soa in (self-or-ancestor):
        if soa is a choice, min ← 0
        if soa has q='?', min ← 0
        if soa has q='+', max ← ∞
        if soa has q='*', min ← 0, max ← ∞
For each distinct child element name, sum up the individual leaf-node mins and maxes (any sum that includes ∞ is ∞).
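The two steps above can be sketched in Python roughly as follows. This is only an illustration: the nested-tuple representation of the content model and the names `summarize`, `walk`, and `EXAMPLE` are assumptions made for the sketch, not part of the existing tool.

```python
import math

# Hypothetical representation (an assumption for this sketch):
#   group: ("seq" | "choice", q, [children])
#   leaf:  ("elem", q, name)
# where q is "", "?", "+" or "*".

def summarize(model):
    """Return {name: (min, max)} summed over all leaf nodes of the model."""
    totals = {}

    def walk(node, optional, unbounded):
        kind, q, payload = node
        # Explicit quantifier on this node (self-or-ancestor).
        optional = optional or q in ("?", "*")
        unbounded = unbounded or q in ("+", "*")
        if kind == "elem":
            lo = 0 if optional else 1
            hi = math.inf if unbounded else 1
            entry = totals.setdefault(payload, [0, 0])
            entry[0] += lo
            entry[1] += hi  # math.inf absorbs any further additions
        else:
            # Anything below a choice may be skipped, so min drops to 0.
            for child in payload:
                walk(child, optional or kind == "choice", unbounded)

    walk(model, False, False)
    return {name: (lo, hi) for name, (lo, hi) in totals.items()}

# ( a, (b|c), a+, (d|a) )
EXAMPLE = ("seq", "", [
    ("elem", "", "a"),
    ("choice", "", [("elem", "", "b"), ("elem", "", "c")]),
    ("elem", "+", "a"),
    ("choice", "", [("elem", "", "d"), ("elem", "", "a")]),
])

print(summarize(EXAMPLE))
# {'a': (2, inf), 'b': (0, 1), 'c': (0, 1), 'd': (0, 1)}
```

Passing the `optional` and `unbounded` flags down the recursion has the same effect as walking up from each leaf through its ancestors, and it visits every node only once.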
Example
Let the content model be ( a, (b|c), a+, (d|a) )
"a" appears three times in this model. Let's call the first leaf node of "a" {a1}, the second {a2}, and the third {a3}. Then, in the first part of the algorithm, we find that the implicit quantifiers are:
{a1}: min=1, max=1 (this is part of a sequence, which has no effect on the quantifiers)
{a2}: min=1, max=∞ (because of the "+" quantifier on the "a")
{a3}: min=0, max=1 (the inner group is a choice, so min ← 0)
Now sum these up, to get:
{a}: min=2, max=∞
In other words, in this content model, <a> must occur at least twice, and can occur an unbounded number of times. By the same rules, "b", "c", and "d" each appear once, inside a choice, so each gets min=0, max=1.
Format
To record these results, we don't want to put the quantifiers on the nodes of the existing output, because the same child element can appear on more than one node, at different levels of the hierarchy. Instead, I'd suggest adding another section to the output, so that the whole thing looks like this:
<content-model spec="element" minified="(a,(b|c),a+,(d|a))"
               spaced="( a, ( b | c ), a+, ( d | a ) )">
  <seq>
    <child>a</child>
    <choice>
      <child>b</child>
      <child>c</child>
    </choice>
    <child q="+">a</child>
    <choice>
      <child>d</child>
      <child>a</child>
    </choice>
  </seq>
  <!-- This section provides a flattened list of all children, with summary quantifiers -->
  <children>
    <child min="2" max="inf">a</child>
    <child min="0" max="1">b</child>
    <child min="0" max="1">c</child>
    <child min="0" max="1">d</child>
  </children>
</content-model>
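As a rough sketch of how the flattened section could be produced, the following Python uses the standard `xml.etree.ElementTree` module to walk the `seq` / `choice` / `child` tree shown above and append a `<children>` element. The function name `add_children_summary` is hypothetical, and the sketch assumes the model tree is a top-level `seq` or `choice` element whose quantifier, if any, is in a `q` attribute.

```python
import xml.etree.ElementTree as ET

def add_children_summary(content_model):
    """Append a flattened <children> section to a <content-model> element."""
    totals = {}  # name -> [min, max, unbounded]

    def walk(node, optional, unbounded):
        q = node.get("q", "")
        optional = optional or q in ("?", "*")
        unbounded = unbounded or q in ("+", "*")
        if node.tag == "child":
            entry = totals.setdefault(node.text.strip(), [0, 0, False])
            entry[0] += 0 if optional else 1
            entry[1] += 1
            entry[2] = entry[2] or unbounded
        else:
            # Leaves below a choice are optional (min drops to 0).
            for sub in node:
                walk(sub, optional or node.tag == "choice", unbounded)

    # Assumption: only top-level <seq>/<choice> elements form the model tree.
    for top in list(content_model):
        if top.tag in ("seq", "choice"):
            walk(top, False, False)

    summary = ET.SubElement(content_model, "children")
    for name, (lo, hi, unbounded) in totals.items():
        attrs = {"min": str(lo), "max": "inf" if unbounded else str(hi)}
        ET.SubElement(summary, "child", attrs).text = name
    return summary

root = ET.fromstring(
    '<content-model spec="element"><seq><child>a</child>'
    "<choice><child>b</child><child>c</child></choice>"
    '<child q="+">a</child>'
    "<choice><child>d</child><child>a</child></choice>"
    "</seq></content-model>"
)
add_children_summary(root)
print(ET.tostring(root, encoding="unicode"))
```

Keeping the summary in its own `<children>` element leaves the hierarchical `<seq>` tree untouched, so consumers that only need the per-element totals can read that one section.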