Closed wrongu closed 8 years ago
Just re-familiarizing myself with the code and the format so I'll get back to you shortly.
So the output of the library seems to be a very literal abstract syntax tree of the SGF format.
The grammar of the SGF format down to the node level is:
Collection = GameTree { GameTree }
GameTree = "(" Sequence { GameTree } ")"
Sequence = Node { Node }
In other words, a collection is a list of one or more games trees and a game tree is both a sequence of one or more nodes AND a list of zero or more game trees. So that's why a GameTree
has both nodes
AND children
.
Looking at the first example in the SGF FF[4] spec, the list of nodes is the sequence of moves up until there are variations and then each variation is in one of the children.
So consider a very simple SGF game tree fragment: (;C[a];C[b](;C[c])(;C[d];C[e]))
The means a
then b
then two variations. The first variation continues with c
and the second with d
then e
.
So the current output of my library for that would be
GameTree
nodes:
a
b
children:
GameTree
nodes:
c
children:
GameTree
nodes:
d
e
If there are no variations, the GameTree should just have a sequence of nodes with no children. However if there are variations, one way to step through the main line is just iterate over the nodes
then recurse into the first child and iterate over the nodes
and so on.
There are, however, additional attributes on each node which provide useful information and a simpler alternative way of iterating over nodes...
previous
points to the previous node, so e's previous is d; d's previous is b, c's previous is b, b's previous is anext
points to the next node, but notably the first variation if there are more than one, so b's next is cfirst
is 1 if the node is the first in a game or variation, so a, c and d have a first of 1previous_variation
on the first node in a variation points to the first node of the previous variation, e.g. d's previous_variation is cnext_variation
on the first node in a variation points to the first node of the next variation, e.g. c's previous_variation is dvariations
on the last node before a split into variations is a list of the first node is each variation, e.g. b's variations is [c, d]So an alternative way to step through the mainline is just to follow next
on each node.
I hope this helps. I might update the README docs to better explain with this example.
There are definitely things I could add to the API to make it cleaner / easier to use. For example, I could make Collection implement the iterator protocol over GameTrees and make GameTrees implement the iterator protocol over nodes following next
(and hence the mainline). Because a Collection just has one type of child, I could also implement __getitem__
so collection[n]
would return the n-th game.
With this you could iterate the mainline of the first game in a collection as simply as:
for node in collection[0]:
...
I've released 0.2 with pretty much all the API changes mentioned at the end of my previous comment.
See the README.
Looking good! For the purposes of AlphaGo especially, variations aren't a concern and I'd be happy to get rid of sgflib
now.
The only thing that I still find confusing about them, though, is that GameTree.children
don't contain information about which node they are variations on (right?). Taking your (;C[a];C[b](;C[c])(;C[d];C[e]))
example, the two variations c
and de
are essentially alternatives to b
. I'd almost expect to see children
be an array parallel to and of the same size as nodes
, so that children[i]
is a variation on nodes[i]
. Or, I'd expect variations to be stored in the nodes themselves like nodes[i].variations = [GameTree, GameTree]
.
Maybe I just don't "get" the SGF specification, but that's my two cents =)
c
and de
aren't alternatives to b
, they are alternatives to each other, both following from b
.
In other words the two "games" are: abc
and abde
.
Does that address the confusion?
Here's a diagram from the spec which might help further:
@wrongu if you're happy with this explanation, I'll close this ticket
Not an 'issue' per se but a general confusion with how more complex files with variations are handled.
First, what are the
children
of aGameTree
?Second, for the sake of clarity it would be great to see an example of parsing (and interpreting) a file that has variations. Working through the
print1.sgf
example you provide, I found thatGameTree.nodes[i].variations[j].parent.nodes
is one way to see a full variation - is there a more direct way?More to the point: if I just look at the
children
of aCollection
and theirnodes
those should be all of the moves of the game excluding variations, right?