JeffreyBenjaminBrown / digraphs-with-text

BSD 3-Clause "New" or "Revised" License

Grouping, and visual centers of gravity #4

Open JeffreyBenjaminBrown opened 6 years ago

JeffreyBenjaminBrown commented 6 years ago

The problem

Consider the following data:

Germany #makes nice cars
German #is a language #from Germany
Italian #is a language #from Germany
(Italian #is a language #from Germany) #is disputed

This presentation makes the reader work through a lot of redundant phrases: "Germany" appears four times, and "is a language from Germany" appears three times. Some of the data is about Germany, and some of it is about the Italian language.

Spatial organization can reduce such redundancy.

A solution: Grouping, and the special symbols «it» and «these»

Some expressions will be visual centers of gravity. They will be marked by prepending ▣ (Unicode character U+25A3).

The symbol «these» refers to the nodes indented exactly one layer below the expression in which «these» appears. Given a visual center C and some expression E indented at any depth below C, the symbol «it» refers to the path from the center to E, including C but excluding E.

For instance, in the following data:

1 ▣ Germany
2   «it» #makes nice cars
3   «these» #is a language #from «it»
4     German
5     Italian
6       «it» #is disputed 

The «it» in line 2 and the «it» in line 3 both refer to line 1, "Germany". However, the «it» in line 6 refers to the entire path from the center (line 1) down to line 5, i.e. the statement "Italian is a language from Germany".
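To make the intended reading concrete, here is a minimal Python sketch that expands the view above back into flat statements. It is only an illustration, not the project's implementation: the `Node` class and the helpers `wrap`, `resolve`, `meaning` and `statements` are hypothetical names, the root is assumed to be the only center, and "the path from the center" is simplified to "whatever the parent line stands for".

```python
IT, THESE = "«it»", "«these»"

class Node:
    """One line of the view (hypothetical); the root is assumed to be the only center."""
    def __init__(self, text, *children):
        self.text = text
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self

def wrap(expr):
    """Parenthesize compound expressions, i.e. ones that contain a # relation."""
    return f"({expr})" if "#" in expr else expr

def resolve(node):
    """The node's text with «it» replaced by what its parent stands for."""
    if IT in node.text and node.parent is not None:
        return node.text.replace(IT, wrap(meaning(node.parent)))
    return node.text

def meaning(node):
    """The expression a node stands for when a child's «it» points at it."""
    parent = node.parent
    if parent is not None and THESE in parent.text:
        # A child of a «these» line stands for the whole filled-in statement.
        return resolve(parent).replace(THESE, wrap(node.text))
    return resolve(node)

def statements(node):
    """Flat statements encoded by the view, in top-to-bottom order."""
    out = []
    for child in node.children:
        if THESE in node.text:
            out.append(meaning(child))       # one statement per «these» member
        elif IT in child.text and THESE not in child.text:
            out.append(resolve(child))       # a plain «it» line
        out.extend(statements(child))
    return out

view = Node("Germany",
    Node("«it» #makes nice cars"),
    Node("«these» #is a language #from «it»",
        Node("German"),
        Node("Italian",
            Node("«it» #is disputed"))))

for s in statements(view):
    print(s)
```

Under those assumptions the loop prints the four statements from "The problem", including the parenthesized "(Italian #is a language #from Germany) #is disputed".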

By contrast, if we had this:

1 ▣ Germany
2   «it» #makes nice cars
3   «these» #is a language #from «it»
4     German
5     ▣ Italian
6       «it» #sounds flamboyant

then the «it» in line 6 would refer to line 5, "Italian", because line 5 is the closest ancestor marked as a center.
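The "closest ancestor marked as a center" rule could be sketched as follows. This is a hedged illustration only: `parse_view` and `nearest_center` are hypothetical helpers, the views are written without their line numbers, and two spaces per indentation level is an assumption.

```python
CENTER = "▣"   # U+25A3
INDENT = 2     # assumption: two spaces per indentation level

def parse_view(lines):
    """Turn indented view lines into (depth, is_center, text) rows."""
    rows = []
    for line in lines:
        stripped = line.lstrip(" ")
        depth = (len(line) - len(stripped)) // INDENT
        is_center = stripped.startswith(CENTER)
        text = stripped[len(CENTER):].strip() if is_center else stripped.strip()
        rows.append((depth, is_center, text))
    return rows

def nearest_center(rows, i):
    """Text of the closest ancestor of row i that is marked as a center, or None."""
    depth = rows[i][0]
    for d, is_center, text in reversed(rows[:i]):
        if d < depth:          # a shallower row above us is an ancestor
            depth = d
            if is_center:
                return text
    return None

first = parse_view([
    "▣ Germany",
    "  «it» #makes nice cars",
    "  «these» #is a language #from «it»",
    "    German",
    "    Italian",
    "      «it» #is disputed",
])
second = parse_view([
    "▣ Germany",
    "  «it» #makes nice cars",
    "  «these» #is a language #from «it»",
    "    German",
    "    ▣ Italian",
    "      «it» #sounds flamboyant",
])

# Rows are 0-indexed, so index 5 is "line 6" in the numbered examples.
print(nearest_center(first, 5))
print(nearest_center(second, 5))
```

With these inputs the lookup returns "Germany" for the first view and "Italian" for the second. In the first view that center is only the start of the path that «it» refers to, while in the second view «it» refers to the center itself.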

Factoring out common relationships

In the above solution, notice that «these» #is a language #from «it» corresponds to no particular expression in the knowledge graph. The user should not have to specify that kind of factoring; it should be discovered automatically.

In some cases, more than one factoring will be possible.
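One way such automatic factoring might work is to group the statements that mention a chosen center by the pattern left over once the center is replaced with «it». The sketch below is an assumption-laden illustration, not a proposal for the actual data model: statements are represented as a hypothetical (template, members) pair with "_" marking each slot, only statements with at most one member besides the center are handled, and nested statements such as the "#is disputed" one are left out.

```python
from collections import defaultdict

IT, THESE = "«it»", "«these»"

def fill(template, values):
    """Replace each "_" slot in the template, left to right, with the given values."""
    parts = template.split("_")
    out = parts[0]
    for value, rest in zip(values, parts[1:]):
        out += value + rest
    return out

def factor(statements, center):
    """Build an indented view around `center` from flat (template, members) pairs."""
    alone = []                   # lines that mention only the center
    groups = defaultdict(list)   # pattern -> members that share that pattern
    for template, members in statements:
        others = [m for m in members if m != center]
        if center not in members or len(others) > 1:
            continue             # outside the scope of this sketch
        pattern = fill(template, [IT if m == center else THESE for m in members])
        if others:
            groups[pattern].append(others[0])
        else:
            alone.append(pattern)
    view = ["▣ " + center] + ["  " + line for line in alone]
    for pattern, members in groups.items():
        if len(members) == 1:    # nothing to share, so write the statement in full
            view.append("  " + pattern.replace(THESE, members[0]))
        else:
            view.append("  " + pattern)
            view += ["    " + m for m in members]
    return view

data = [
    ("_ #makes nice cars", ("Germany",)),
    ("_ #is a language #from _", ("German", "Germany")),
    ("_ #is a language #from _", ("Italian", "Germany")),
]

print("\n".join(factor(data, "Germany")))
print("\n".join(factor(data, "Italian")))
```

Factoring the same data around "Germany" reproduces the first five lines of the earlier example, while factoring around "Italian" yields a different, equally valid view. This is one simple way in which more than one factoring can arise.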