Customising -p tree output

xanthe-cat commented 2 years ago

Dear Alex,

Thank you very much for contributing this project. I've been using it to assess decision trees and sub-trees while contributing some solvers to Cyrus Freshman's solver leaderboard, which I imagine you know about. The format for the trees submitted to his project has to be considerably pared back (they give no indication of the results obtained at any guess).

I have ~~two~~ one question, which you’re of course at liberty to ignore: is there a way to customise the -p tree output so that it prints the current branch nodes on each line as well as the leaf? In other words, looking at the first two lines of your CHOMP g7 hard mode tree,

    chomp BBBBB1 bezil BBBBB2 tardy BBBBB3 fusks BGBBG4 jujus BGBBG5 vuggs GGGGG6
    (lots of blank spaces) GGGGG5

it would be useful for the second line to include the nodes, e.g.

    chomp BBBBB1 bezil BBBBB2 tardy BBBBB3 fusks BGBBG4 jujus BGBBG5 vuggs GGGGG6
    chomp bezil tardy fusks jujus GGGGG5

It's not so much of an issue at line 2, but some sub-trees can be hundreds or even thousands of lines deep, and the second word that got you there can be many pages earlier!
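To make the request concrete, here is a minimal Python sketch of post-processing a "hollow" tree into a "filled" one. It assumes (this is not confirmed by the tool's documentation) that the -p output uses fixed-width columns and that blanked-out parent nodes are represented by leading spaces; the function name `fill_hollow` is hypothetical. It copies the full parent columns, results included.

```python
def fill_hollow(lines):
    """Fill each line's leading blanks with the text from the line above.

    Assumption: columns are fixed-width and only *leading* spaces stand for
    omitted parent nodes. This is a sketch, not the tool's own logic.
    """
    filled_lines = []
    previous = ""
    for line in lines:
        body = line.lstrip(" ")
        indent = len(line) - len(body)
        # Copy the parent columns from the previous (already filled) line.
        prefix = previous[:indent].ljust(indent)
        previous = prefix + body
        filled_lines.append(previous)
    return filled_lines


first = "chomp BBBBB1 bezil BBBBB2 tardy BBBBB3 fusks BGBBG4 jujus BGBBG5 vuggs GGGGG6"
# Second line: blanks up to the column holding the result for "jujus".
second = " " * first.index("BGBBG5") + "GGGGG5"

for line in fill_hollow([first, second]):
    print(line)
```

Because each line is filled from the already-filled line above it, the approach also works when several consecutive lines are hollow.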

Edit: I solved the second question myself by carefully editing the result letters in wordle.cpp. Specifically, I’m interested in sorting G, then Y, then B, so the simplest option was to change B to Z. (My second question was whether you can alter the alphabetisation of the results (by changing the result letters, e.g. sort B after G and Y) or use different letters for the results. I see there’s a bit of the code called humanorder which assigns an ordering to all 243 result codes, and am wondering if there's an elegant way to alter the order it generates without doing it by brute force.)
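For illustration, a short Python sketch of both ideas: ordering the 3^5 = 243 result codes by an arbitrary colour order, and the letter-substitution workaround (B to Z, sort alphabetically, Z back to B). This is not how humanorder in wordle.cpp works; the names `result_codes` and `sort_codes` are hypothetical.

```python
from itertools import product


def result_codes():
    """All 3**5 = 243 five-letter strings over the colours B, G, Y."""
    return ["".join(p) for p in product("BGY", repeat=5)]


def sort_codes(codes, colour_order="GYB"):
    """Sort codes lexicographically using the given colour precedence."""
    rank = {colour: i for i, colour in enumerate(colour_order)}
    return sorted(codes, key=lambda code: [rank[c] for c in code])


codes = result_codes()
gyb = sort_codes(codes, "GYB")

# The substitution workaround: rename B to Z, sort alphabetically
# (G < Y < Z), then rename Z back to B.
via_z = [c.replace("Z", "B") for c in sorted(c.replace("B", "Z") for c in codes)]

print(len(codes), gyb[0], gyb[-1], gyb == via_z)
```

The two approaches give the same order because G, Y, Z are already in alphabetical order, so plain string sorting after the substitution matches the explicit G, Y, B ranking.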

alex1770 commented 2 years ago

Dear xanthe-cat,

I agree with your first change - filling in the blanks in the tree output - I was meaning to make it anyway. I'll do that shortly.

Re the output order B/G/Y - I can see why you might want the more definite ones (G, Y) first, but the reason I used B first was that I thought it's easy when you get a lot of greens in response to your guess; the interesting case is when you get lots of Bs. Maybe best to make this a command line option to cater for different preferences?

xanthe-cat commented 2 years ago

Dear Alex,

Thanks for such a swift response.

My second question was somewhat moot anyway (as I simply substituted letters, then did a resubstitution on the output). I agree totally with your reasoning; however, one of my uses for the trees is didactic, to demonstrate the use of a decision tree, and to my mind the sort order with B first does not show quite as clearly how the tree is sorting through the words (especially for an audience of non-programmers or non-mathematicians). I tweeted an example here just now: https://twitter.com/Xanthe_Cat/status/1503513597385015301

And I should have added, congratulations on finding two hard mode, depth-6 trees for the complete word set. Those must have taken a large amount of computation? Laurent Poirrier some time ago said he had a depth-13 tree and has probably done better since, but are you aware of any others investigating hard mode, ~12,900 hidden words?

alex1770 commented 2 years ago

I've added an option -S that allows you to choose the format of the decision tree. You can choose "filled" or "hollow", and you can also choose the sort order of the colours. For example -S fGYB would make a filled-in tree with the more certain colours coming before the less certain ones (G then Y then B).

alex1770 commented 2 years ago

As for the complete word set in hard mode, the trees require 7 guesses (I call this depth 7, though I've just remembered that some people refer to this as depth 6, which I guess is the notation you are using), and 7 is actually the best possible. I've been meaning to write this up (along with some other things); it will hopefully appear soon.

xanthe-cat commented 2 years ago

Haha, yes, there seem to be two conventions, one of which is that a tree with its root at the first level and its furthest leaves at level n is a tree of (n–1) branches/depth. You’re right, I meant that a tree of depth 6 requires 7 guesses in total. And that is pretty amazing given the full wordlist has some gnarly words to reach in hard mode, like the 19 _ills words, or the subsets of 15 words ending in -acks, -angs, -ests, -ight, and -ines!
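The two conventions can be shown with a small Python sketch. It assumes a toy representation in which a tree node is a dict mapping each result code to a child subtree, with an empty dict as a leaf; this representation and the function names are hypothetical, not taken from the tool.

```python
def guesses_needed(tree):
    """Count levels: the root is level 1, so a lone leaf needs 1 guess."""
    return 1 + max((guesses_needed(child) for child in tree.values()), default=0)


def depth_in_branches(tree):
    """Count branches on the longest root-to-leaf path (levels minus one)."""
    return guesses_needed(tree) - 1


# Build a toy chain with 7 levels: six nested branches above a leaf.
tree = {}
for _ in range(6):
    tree = {"BBBBB": tree}

print(guesses_needed(tree), depth_in_branches(tree))
```

For this chain the first function gives 7 (guesses) and the second gives 6 (branches), which is the same tree described under the two conventions.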

Thanks again so much for this tool, it is really excellent.

PS Heads up, I saw just now that the NYT have changed their wordlists again. The hidden list is still 2,309 words; it has only been permuted. The guesses list has had the 25 words restored, so it now stands at 10,663; combined, that makes 12,972 words and exact parity with the previous CSW19 subset of five-letter words.

alex1770 / wordle

Customising -p tree output #3