jonadsimon / wonder-words-generator

Generates WonderWords puzzles
Apache License 2.0

Select words with large number of potentially-overlapping letters #29

Closed jonadsimon closed 2 years ago

jonadsimon commented 2 years ago

Noticed that it's easy to make dense, highly-overlapping boards when using words from languages like Greek or Hindi, which contain a large number of repeated vowels (e.g. a's), because the repeats give the words more opportunities to overlap.

Can quantify this by using some sort of hypergraph formulation, e.g. which subsets of words could be potentially connected.

An alternative is to use weighted edges, since a word with two a's is 2x easier to overlap with another a-word than a word with only one a.
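Rough sketch of the weighted-edge idea, just to pin down what the weight could be. Assumption: the weight between two words is the number of (letter in word A, same letter in word B) pairings, ignoring position and orientation. `overlap_weight` is a made-up name, not something in the repo.

```python
from collections import Counter


def overlap_weight(word_a, word_b):
    """Hypothetical edge weight: number of matching letter pairings.

    For each letter, multiply its count in word_a by its count in word_b,
    then sum over all letters. Position and orientation are ignored.
    """
    counts_a = Counter(word_a.lower())
    counts_b = Counter(word_b.lower())
    # Counter returns 0 for letters that are missing from word_b.
    return sum(n * counts_b[letter] for letter, n in counts_a.items())


# A word with two a's gets twice the weight of a word with one a.
print(overlap_weight("mama", "cat"))  # 2
print(overlap_weight("map", "cat"))   # 1
```

With this definition the weight scales with the product of the letter counts, which matches the "2x easier" intuition.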

Need to think about this formulation more

jonadsimon commented 2 years ago

Where the letters occur within a word has a significant bearing on this. In the Yoga puzzle, 5 words intersect on the same letter S, but ONLY because they all start with S: (image)

jonadsimon commented 2 years ago

Simplest approach:

  1. for each word pair, count the # of distinct ways (wrt orientation) that they could overlap
  2. plot this pairwise distribution

Potential issues: favors long words over short words, doesn't give a way to compare different-sized word sets, doesn't give a way to describe 3+ word overlaps, doesn't give a way to estimate the total # of letters covered by overlaps
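Minimal sketch of the two steps above, under some loud assumptions: only single-cell crossings are counted (no collinear overlaps, no board-size limits), and each matching letter pairing is multiplied by a fixed number of relative orientations. The value 6 (8 directions minus the same and the reversed direction) is a guess, not taken from the generator. All names here are hypothetical, and the "plot" is a plain-text histogram.

```python
from collections import Counter
from itertools import combinations

# Assumption: with word A fixed, word B can cross it in 6 of the 8 directions
# (everything except A's own direction and its reverse).
N_RELATIVE_ORIENTATIONS = 6


def count_crossings(word_a, word_b, n_orientations=N_RELATIVE_ORIENTATIONS):
    """Count single-cell crossings between two words.

    Every (i, j) with word_a[i] == word_b[j] is one shared cell, and each
    shared cell is assumed to be reachable in n_orientations ways.
    """
    a, b = word_a.lower(), word_b.lower()
    shared_cells = sum(1 for ch_a in a for ch_b in b if ch_a == ch_b)
    return shared_cells * n_orientations


def pairwise_distribution(words):
    """Map each crossing count to the number of word pairs that have it."""
    return Counter(count_crossings(a, b) for a, b in combinations(words, 2))


def print_histogram(dist):
    """Text stand-in for the plot: one '#' per word pair."""
    for n_ways in sorted(dist):
        print(f"{n_ways:4d} | {'#' * dist[n_ways]}")


print_histogram(pairwise_distribution(["asana", "karma", "yogi"]))
```

This sketch shows the first issue listed above directly: the count grows with the product of the word lengths, so long words dominate unless the counts are normalized.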

jonadsimon commented 2 years ago

This simple approach is probably the right one. Long words have the potential to overlap with a large # of other words, but having too many of them limits the total number of words because of the 1.1-1.25x letter cap. Therefore the best mix is a small number of long words and a large number of shorter words.

See if there is some roughly-consistent distribution being used among the WonderWords puzzles.