paulfitz / cosmicos

Sending the lambda calculus into deep space
https://cosmicos.github.io/
GNU General Public License v2.0
134 stars 8 forks source link

Concerning the "A question of character" posting on your site - prefix codes? #30

Open joha2 opened 3 years ago

joha2 commented 3 years ago

Hey @paulfitz and @aw1231 long time no hear! I hope you're doing well!

In the wiki I already asked this question, but to my knowledge we never discussed this:

Is it possible to use prefix codes for the "words" introduced by intro?

https://www.cs.princeton.edu/courses/archive/spr01/cs126/assignments/prefix.html

https://en.wikipedia.org/wiki/Prefix_code

This should simplify the decoding, since no "word" is prefix of another "word". I would also suggest to use this for the binary numbers, such that they also get unique codes. At the end the recipient does not care whether in the message the binary one is writen as 1 or another combination. Or one could start the tree for the prefix with 0 and 1 (but this does not mean 1 will stay 1). See the tree in the attached image, where 0 would be 0, but 1 would be 10 and unary would be 110 (depending whether you go left or right seen from the root node). At the end this would mitigate issues like + being 10 something which is not searchable in the message. OK, at the end the shorter codes can appear within the longer ones, so maybe a simple search is not useful. But if you get the bytestream byte by byte every command is unique :smile: Is this an option?

Best wishes Johannes

tree_proposal

alanfwilliams commented 3 years ago

Hi, been a while! Coming at this from a non-CS perspective, basically what you're proposing is to take each number or vocab word and add it to a tree? Sound great for what we have now. Though, supposing this project takes off, managing the tree would be a task on its own, unless one could automate adding to the tree each time it finds a new word on recompiling which i'm sure is doable, I just don't have the time. The search advantages would make it useful. Would making this into another variant perhaps be the best way to implement this? It keeps all the source material the same, but the new variant would be searchable. Just a thought.

joha2 commented 2 years ago

Hey @aw1231 sorry for the neglection of the cosmicos from my side. Yeah definitely this tree structure is something for another variant. I do not know how complicated the implementation is, but maybe one needs at least a two stage compilation. But maybe this could mitigate the issue

After that something strange appeared. It looked like symbols were equated to numbers. I've somewhat quickly realized after reverting back to 2xxxx representation that that was actually equating numbers to their binary representation. A bit further there was even an arithmetic on binary numbers and that was confusing as some of the numbers looked like they were actual binary numbers and some of the numbers were symbols.

in #31 mentioned by @pavgran.