The problem
Consider the following data:
Germany #makes nice cars
German #is a language #from Germany
Italian #is a language #from Germany
(Italian #is a language #from Germany) #is disputed
The presentation above makes the reader wade through redundant phrases: "Germany" appears four times, and "is a language from Germany" appears three times. Some of the data is about Germany, and some of it is about the Italian language.
Spatial organization can reduce such redundancy.
A solution: Grouping, and the special symbols «it» and «these»
Some expressions serve as visual centers of gravity. They are marked by prepending ▣ (Unicode character U+25A3).
The symbol «these» refers to the nodes indented exactly one layer below the expression in which «these» appears. The symbol «it» is defined relative to a visual center: if «it» appears in an expression E indented at any depth below its nearest enclosing center C, then «it» refers to the path from C to E, including C but excluding E.
For instance, in the following data:
1 ▣ Germany
2   «it» #makes nice cars
3   «these» #is a language #from «it»
4     German
5     Italian
6       «it» #is disputed
The «it» in line 2 and the «it» in line 3 both refer to line 1, "Germany". The «it» in line 6, however, refers to the entire path from the center down to the parent of line 6 (lines 1, 3 and 5), i.e. the statement "Italian is a language from Germany".
By contrast, if we had this:
1 ▣ Germany
2   «it» #makes nice cars
3   «these» #is a language #from «it»
4     German
5     ▣ Italian
6       «it» #sounds flamboyant
then the «it» in line 6 would refer to line 5 alone, because line 5 is the closest ancestor marked as a center.
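The two examples above can be checked with a small sketch. It is only an illustration of the rules as stated, not an implementation from this project: the function names (`parse`, `ancestors`, `it_path`, `these`), the row layout, and the convention of two spaces per indentation level are all assumptions made here. The leading line numbers are dropped from the input, so row index 0 corresponds to line 1.

```python
CENTER = "▣"

def parse(lines):
    """Turn indented lines into (depth, is_center, text) rows.
    Assumption: two spaces of indentation per level."""
    rows = []
    for line in lines:
        stripped = line.lstrip(" ")
        depth = (len(line) - len(stripped)) // 2
        is_center = stripped.startswith(CENTER)
        text = stripped.lstrip(CENTER).strip()
        rows.append((depth, is_center, text))
    return rows

def ancestors(rows, i):
    """Indices of the ancestors of row i, nearest first."""
    result = []
    depth = rows[i][0]
    for j in range(i - 1, -1, -1):
        if rows[j][0] < depth:
            result.append(j)
            depth = rows[j][0]
    return result

def it_path(rows, i):
    """Rows that an «it» in row i refers to: the path from the nearest
    center ancestor down to the parent of row i (row i itself excluded).
    If no ancestor is a center, the path runs up to the root."""
    path = []
    for j in ancestors(rows, i):
        path.append(j)
        if rows[j][1]:  # stop at the closest center
            break
    return list(reversed(path))

def these(rows, i):
    """Rows indented exactly one level below row i."""
    depth = rows[i][0]
    children = []
    for j in range(i + 1, len(rows)):
        if rows[j][0] <= depth:
            break  # left the subtree of row i
        if rows[j][0] == depth + 1:
            children.append(j)
    return children

first = parse([
    "▣ Germany",
    "  «it» #makes nice cars",
    "  «these» #is a language #from «it»",
    "    German",
    "    Italian",
    "      «it» #is disputed",
])
second = parse([
    "▣ Germany",
    "  «it» #makes nice cars",
    "  «these» #is a language #from «it»",
    "    German",
    "    ▣ Italian",
    "      «it» #sounds flamboyant",
])

# Report 1-based line numbers to match the prose.
print([j + 1 for j in it_path(first, 5)])   # [1, 3, 5]
print([j + 1 for j in these(first, 2)])     # [4, 5]
print([j + 1 for j in it_path(second, 5)])  # [5]
```

In the first example the «it» on line 6 resolves to lines 1, 3 and 5, which together spell out "Italian is a language from Germany". In the second, the search stops at line 5 because it is itself a center.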
Factoring out common relationships
In the above solution, notice that
«these» #is a language #from «it»
corresponds to no particular expression in the knowledge graph. The user should not have to specify that kind of factoring; it should be discovered automatically.
In some cases, more than one factoring will be possible.
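One way such discovery might work is sketched below. This is a guess at an approach, not the method used by any existing tool: it assumes the flat data is available as binary relationships of the form (left member, template, right member), and it proposes a grouping wherever two or more statements share a template and one member. The function name `candidate_factorings` is hypothetical, and the third statement (about Austria) is invented purely to show two competing factorings.

```python
from collections import defaultdict

def candidate_factorings(statements, min_size=2):
    """Propose groupings of statements that share a template and one member.
    Returns (center, header, members) triples."""
    by_right = defaultdict(list)  # (template, right) -> left members
    by_left = defaultdict(list)   # (left, template) -> right members
    for left, template, right in statements:
        by_right[(template, right)].append(left)
        by_left[(left, template)].append(right)

    found = []
    for (template, right), lefts in by_right.items():
        if len(lefts) >= min_size:
            found.append((right, f"«these» {template} «it»", lefts))
    for (left, template), rights in by_left.items():
        if len(rights) >= min_size:
            found.append((left, f"«it» {template} «these»", rights))
    return found

statements = [
    ("German", "#is a language #from", "Germany"),
    ("Italian", "#is a language #from", "Germany"),
    ("German", "#is a language #from", "Austria"),  # invented for illustration
]

for center, header, members in candidate_factorings(statements):
    print(f"▣ {center}")
    print(f"  {header}")
    for member in members:
        print(f"    {member}")
```

Here the statement "German is a language from Germany" fits under either a Germany-centered group or a German-centered group, so the sketch returns both candidates. Choosing between them (or showing both) is left open.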