asvd / microlight

highlights code in any programming language
http://asvd.github.io/microlight
MIT License
1.48k stars, 63 forks

This is the coolest thing ever!!! #5

Closed: bweir closed this issue 8 years ago

bweir commented 8 years ago

Thank you for making this!

asvd commented 8 years ago

wow. thanks!

(sorry, I'm not gonna fix this issue)

Qix- commented 8 years ago
pachacamac commented 8 years ago

This is probably the best place to ask instead of opening a new issue. I must know what tool/algorithm you used to create that convoluted beast of a RegEx! Please don't say you did this by hand :D

asvd commented 8 years ago

@pachacamac yes, that was built by hand :-)

But it's not as complicated as you might assume. At first I also googled for an algorithm that would do this for me, but then I realized that the principle is simple. Roughly speaking, it makes sense to "factor out" a shared first letter when four or more words start with it:

lambda|let|lock|long
l(ambda|et|ock|ong)    <-- one byte saved!

With three or fewer words, there is no point in factoring out:

global|goto|guard
g(lobal|oto|uard)      // same length

do|double
d(o|ouble)             // even bigger

These were the most typical cases when making a decision. For the others, when in doubt, I just compared the lengths.
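The length comparison described above can be checked mechanically. Here is a minimal Python sketch, not the code used for microlight; `factoring_gain` is a hypothetical helper name, and it assumes all words share the same first letter. For n such words the parentheses cost two bytes and the removed letters save n - 1, so the gain works out to n - 3, which matches the three examples.

```python
def factoring_gain(words):
    """Compare w1|w2|... with p(rest1|rest2|...), where p is the shared first letter.

    Returns (bytes_saved, factored_pattern); a negative value means
    the factored form is longer. Hypothetical helper for illustration.
    """
    first_letters = {w[0] for w in words}
    assert len(first_letters) == 1, "all words must start with the same letter"

    plain = "|".join(words)
    factored = words[0][0] + "(" + "|".join(w[1:] for w in words) + ")"
    return len(plain) - len(factored), factored


print(factoring_gain(["lambda", "let", "lock", "long"]))
print(factoring_gain(["global", "goto", "guard"]))
print(factoring_gain(["do", "double"]))
```

The three calls reproduce the cases from the comment: one byte saved, no change, and one byte lost.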

pachacamac commented 8 years ago

Really wonder though if that could be done automatically. Huffman encoding comes to mind, but right now I can't think of how to implement it, or whether it's worth it. It's kind of interesting, though :D

EDIT: Almost. Not Huffman tree but https://en.wikipedia.org/wiki/Radix_tree
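
To make the tree idea concrete, here is a rough Python sketch of how the manual rule might be automated. It uses a plain character trie (the uncompressed relative of a radix tree) and, at each branch, factors out the shared letter only if that makes the pattern strictly shorter. The function names are made up for illustration, the decision is greedy and not guaranteed to give the shortest possible pattern, and it assumes plain keywords without regex metacharacters that are matched as whole words.

```python
import re


def build_trie(words):
    """Build a nested-dict trie; the empty key "" marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}
    return root


def alternatives(node):
    """Return the regex alternatives that match every suffix stored below node."""
    alts = []
    for ch, child in node.items():
        if ch == "":
            alts.append("")  # a word ends here
            continue
        subs = alternatives(child)
        plain = [ch + s for s in subs]
        if len(subs) == 1:
            alts.extend(plain)  # nothing to factor out
            continue
        factored = ch + "(" + "|".join(subs) + ")"
        # Factor out only when it is strictly shorter, as in the manual rule.
        if len(factored) < len("|".join(plain)):
            alts.append(factored)
        else:
            alts.extend(plain)
    return alts


def compact_regex(words):
    return "|".join(alternatives(build_trie(words)))


words = ["lambda", "let", "lock", "long", "global", "goto", "guard", "do", "double"]
pattern = compact_regex(words)
print(pattern)
# Sanity check: every keyword still matches as a whole word.
print(all(re.fullmatch(pattern, w) for w in words))
```

On this word list the sketch factors only the `l` group and leaves the `g` and `d` groups alone, which agrees with the hand-made decisions earlier in the thread.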

brasofilo commented 5 months ago

Keeps being cool! Thanks for not fixing it :)