This is really cool!

mentalisttraceur commented 3 years ago

I really like the thought process behind this. Nice idea.

mentalisttraceur commented 3 years ago

I thought of something similar before, because I also faced the problem of "how do I make it maximally easy for myself to manually transcribe arbitrary data", so when I saw this I thought it was really cool that someone else had gone through the same process!

I never actually coded my version, but in my idea I was using the word list used by the pwqgen utility (from Openwall's passwdqc). I don't know if you considered that word list, but I think you might find it more useful for your goals, because it has a few useful properties:

- All the words are 3-6 characters, so they are quick and easy to type even without autocomplete.
- The words are relatively distinct from each other, so it should still be easy to not lose your place.
- There are 4096 words, so you can encode 3 bytes into two words.

In my head I was initially calling the idea "base4096 encoding", just like you called yours "base256". Then to emphasize the human-friendliness and deemphasize any implication of space-efficiency, I decided to call it "human4096 encoding". Then I realized that was pretty English-centric of me, so I'm tempted to call it "english4096". That's nice because it also leaves room for many other "{{ human language name }}4096" encodings. Or maybe some other number than 4096. But 4096 seems like a good sweet spot for distinct words that humans can keep track of which also track byte boundaries with a reasonably regular period.

Biggest differences from your implementation:

- Biggest relative advantage of the english4096 encoding is that the encodings tend to be much smaller.
- english4096 is worse in cases where the auto-complete really dominates, like thumb input through a touchscreen keyboard on a phone.
- english4096 is better when using a hardware keyboard with physical buttons, or when auto-complete-like assistance isn't available.
- Biggest relative disadvantage of the english4096 encoding is the implementation complexity: the lack of 1-1 mapping to bytes.

(12-bit steps over 8-bit bytes is regular enough that we can still do a simple loop which reads three bytes at a time and has a special case for the two possible short reads. But it's still more code and bit-twiddling than just a simple table lookup.

Also, english4096 needs one extra word to handle those two cases in the decoder - if this modifier word is present, the word it modifies only encodes the bits of the first of the two bytes it would normally encode bits of. My intuition is to put it in front of the last word rather than behind, to make the logic simpler in the decoder, since that way the decoder doesn't have to "hold back" the bytes for a word until after checking if the next word is a modifier. You could use a modifier character or capitalization to signal this instead of an extra word, but I think capitalization is worse for human-friendliness because it complicates the mental model from the stupidly simple "just type the words you see" with extra details like "you need to be mindful of capitalization/symbols". To someone with attention to detail and computer savvy, the idea of replicating text exactly is automatic, but to a lot of lay people it is not necessarily intuitive that special characters or capitalization can't be just ignored when transcribing.)
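For a rough sense of the size difference, here is a small Python sketch that counts words per input length. It assumes base256 emits exactly one word per byte, and that english4096 spends one extra modifier word on each of the two short-read cases, as described above. The function names are made up for this illustration.

```python
def base256_word_count(n_bytes: int) -> int:
    # Assumption: one word per input byte.
    return n_bytes


def english4096_word_count(n_bytes: int) -> int:
    full_chunks, leftover = divmod(n_bytes, 3)
    # leftover 1 -> modifier + one word; leftover 2 -> word + modifier + word
    tail_words = (0, 2, 3)[leftover]
    return 2 * full_chunks + tail_words


for n in (1, 2, 3, 16, 32):
    print(n, base256_word_count(n), english4096_word_count(n))
```

Under those assumptions a 32-byte input comes out as 23 words instead of 32, while one- and two-byte inputs actually come out one word longer.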

Anyway, you've inspired me to actually implement and document english4096 encoding! Cheers!

ctsrc commented 1 year ago

Thank you for the kind words and for the details about your encoding.

I will keep this ticket open, as I might decide to add an implementation of your english4096 idea to my utility as well.

mentalisttraceur commented 1 year ago

Love to hear that! Would be super cool to have multiple compatible implementations!

I'll try to finally publish a precise spec and some test case examples later this week, then.

mentalisttraceur commented 1 year ago

Okay, here's a fuller, precise spec:

If you need a name string for the format, use english4096 (all lowercase if possible).
The word list is the 4096 words from Openwall's passwdqc's pwqgen's wordlist, plus "zygote".

[click to expand full word list]
``` aback abbey abbot abide ablaze able aboard abode abort abound about above abroad abrupt absent absorb absurd abuse accent accept access accord accuse ace ache aching acid acidic acorn acre across act action active actor actual acute adam adapt add added addict adept adhere adjust admire admit adobe adopt adrift adult adverb advert aerial afar affair affect afford afghan afield afloat afraid afresh after again age agency agenda agent aghast agile ago agony agree agreed ahead aid aide aim air airman airy akin alarm alaska albeit album alert alibi alice alien alight align alike alive alkali all allah alley allied allow alloy ally almond almost aloft alone along aloof aloud alpha alpine also altar alter always amaze amazon amber ambush amen amend amid amidst amiss among amount ample amuse anchor and andrew anew angel anger angle anglo angola angry animal ankle annoy annual answer anthem anti antony any anyhow anyway apart apathy apex apiece appeal appear apple apply april apron arab arcade arcane arch arctic ardent are area argue arid arise arm armful armpit army aroma around arouse array arrest arrive arrow arson art artery artful artist ascent ashen ashore aside ask asleep aspect assay assent assert assess asset assign assist assume assure asthma astute asylum ate athens atlas atom atomic attach attack attain attend attic auburn audio audit august aunt auntie aura austin author auto autumn avail avenge avenue avert avid avoid await awake awaken award aware awash away awful awhile axes axiom axis axle aye babe baby bach back backup bacon bad badge badly bag baggy bail bait bake baker bakery bald ball ballad ballet ballot baltic bamboo ban banal banana band bang bank bar barber bare barely barge bark barley barn baron barrel barren basalt base basic basil basin basis basket basque bass bat batch bath baton battle bay beach beacon beak beam bean bear beard beast beat beauty become bed beech beef beefy beep beer beet beetle before beggar begin behalf behave behind 
beige being belief bell belly belong below belt bench bend benign bent berlin berry berth beset beside best bestow bet beta betray better beware beyond bias bible biceps bicker bid big bigger bike bile bill binary bind biopsy birch bird birdie birth bishop bit bitch bite bitter black blade blame bland blast blaze bleak blend bless blew blind blink blip bliss blitz block blond blood bloody bloom blot blouse blow blue bluff blunt blur blush boar board boast boat bodily body bogus boil bold bolt bomb bombay bond bone bonn bonnet bonus bony book boom boost boot booth booze border bore borrow bosom boss boston both bother bottle bottom bought bounce bound bounty bout bovine bow bowel bowl box boy boyish brace brain brainy brake bran branch brand brandy brass brave bravo brazil breach bread break breast breath bred breed breeze brew brick bride bridge brief bright brim brine bring brink brisk briton broad broke broken bronze brook broom brown bruise brush brutal brute bubble buck bucket buckle buddha budget buffet buggy build bulb bulge bulk bulky bull bullet bully bump bumpy bunch bundle bunk bunny burden bureau burial buried burly burma burn burnt burrow burst bury bus bush bust bustle busy but butler butt butter button buy buyer buzz bye byte cab cabin cable cache cactus caesar cage cairo cake calf call caller calm calmly came camel camera camp campus can canada canal canary cancel cancer candid candle candy cane canine canoe canopy canvas canyon cap cape car carbon card care career caress cargo carl carnal carol carp carpet carrot carry cart cartel case cash cask cast castle casual cat catch cater cattle caught causal cause cave cease celery cell cellar celtic cement censor census cereal cervix chain chair chalk chalky champ chance change chant chaos chap chapel charge charm chart chase chat cheap cheat check cheek cheeky cheer cheery cheese chef cherry chess chest chew chic chick chief child chile chill chilly chin china chip choice choir choose chop choppy chord 
chorus chose chosen christ chrome chunk chunky church cider cigar cinema circa circle circus cite city civic civil clad claim clammy clan clap clash clasp class clause claw clay clean clear clergy clerk clever click client cliff climax climb clinch cling clinic clip cloak clock clone close closer closet cloth cloud cloudy clout clown club clue clumsy clung clutch coach coal coarse coast coat coax cobalt cobra coca cock cocoa code coffee coffin cohort coil coin coke cold collar colon colony colt column comb combat come comedy comic commit common compel comply concur cone confer congo consul convex convey convoy cook cool cope copper copy coral cord core cork corn corner corps corpse corpus cortex cosmic cosmos cost costly cosy cotton couch cough could count county coup couple coupon course court cousin cove cover covert cow coward cowboy crab crack cradle craft crafty crag crane crap crash crate crater crawl crazy creak cream creamy create credit creed creek creep creepy crept crest crew cried crime crisis crisp critic croft crook crop cross crow crowd crown crude cruel cruise crunch crush crust crux cry crypt cuba cube cubic cuckoo cuff cult cup curb cure curfew curl curry curse cursor curve custom cut cute cycle cyclic cynic cyprus czech dad daddy dagger daily dairy daisy dale dallas damage damn damp dampen dance danger danish dare dark darken darwin dash data date david dawn day dead deadly deaf deal dealer dean dear death debate debit debris debt debtor decade decay decent decide deck decor decree deduce deed deep deeply deer defeat defect defend defer define defy degree deity delay delete delhi delta demand demise demo demon demure denial denote dense dental deny depart depend depict deploy depot depth deputy derby derive desert design desire desist desk detail detect deter detest detour device devil devise devoid devote devour dial diana diary dice dictum did die diesel diet differ digest digit dine dinghy dinner diode dire direct dirt dirty disc disco dish 
disk dismal dispel ditch dive divert divide divine dizzy docile dock doctor dog dogma dole doll dollar dolly domain dome domino donate done donkey donor doom door dorsal dose double doubt dough dour dove down dozen draft drag dragon drain drama drank draw drawer dread dream dreary dress drew dried drift drill drink drip drive driver drop drove drown drug drum drunk dry dual dublin duck duct due duel duet duke dull duly dumb dummy dump dune dung duress during dusk dust dusty dutch duty dwarf dwell dyer dying dynamo each eager eagle ear earl early earn earth ease easel easily east easter easy eat eaten eater echo eddy eden edge edible edict edit editor edward eerie eerily effect effort egg ego eight eighth eighty either elbow elder eldest elect eleven elicit elite else elude elves embark emblem embryo emerge emit empire employ empty enable enamel end endure enemy energy engage engine enjoy enlist enough ensure entail enter entire entry envoy envy enzyme epic epoch equal equate equip equity era erase erect eric erode erotic errant error escape escort essay essex estate esteem ethic ethnic europe evade eve even event ever every evict evil evoke evolve exact exam exceed excel except excess excise excite excuse exempt exert exile exist exit exodus exotic expand expect expert expire export expose extend extra eye eyed fabric face facial fact factor fade fail faint fair fairly fairy faith fake falcon fall false falter fame family famine famous fan fancy far farce fare farm farmer fast fasten faster fat fatal fate father fatty fault faulty fauna fear feast feat fed fee feeble feed feel feet fell fellow felt female fence fend ferry fetal fetch feudal fever few fewer fiance fiasco fiddle field fiend fierce fiery fifth fifty fig fight figure file fill filled filler film filter filth filthy final finale find fine finger finish finite fire firm firmly first fiscal fish fisher fist fit fitful five fix flag flair flak flame flank flap flare flash flask flat flaw fled flee fleece 
fleet flesh fleshy flew flick flight flimsy flint flirt float flock flood floor floppy flora floral flour flow flower fluent fluffy fluid flung flurry flush flute flux fly flyer foal foam focal focus fog foil fold folk follow folly fond fondly font food fool foot for forbid force ford forest forge forget fork form formal format former fort forth forty forum fossil foster foul found four fourth fox foyer frail frame franc france frank fraud free freed freely freer freeze french frenzy fresh friar friday fridge fried friend fright fringe frock frog from front frost frosty frown frozen frugal fruit fudge fuel fulfil full fully fun fund funny fur furry fury fuse fusion fuss fussy futile future fuzzy gadget gag gain gala galaxy gale gall galley gallon gallop gamble game gamma gandhi gang gap garage garden garlic gas gasp gate gather gauge gaul gaunt gave gay gaze gear geese gemini gender gene geneva genial genius genre gentle gently gentry genus george german get ghetto ghost giant gift giggle gill gilt ginger girl give given glad glade glance gland glare glass glassy gleam glee glide global globe gloom gloomy gloria glory gloss glossy glove glow glue goal goat god gold golden golf gone gong good goose gorge gory gosh gospel gossip got gothic govern gown grab grace grade grain grand grant grape graph grasp grass grassy grate grave gravel gravy gray grease greasy great greece greed greedy greek green greet grew grey grid grief grill grim grin grind grip grit gritty groan groin groom groove gross ground group grove grow grown growth grudge grunt guard guess guest guide guild guilt guilty guise guitar gulf gully gun gunman guru gut guy gypsy habit hack had hague hail hair hairy haiti hale half hall halt hamlet hammer hand handle handy hang hangar hanoi happen happy harass hard harder hardly hare harem harm harp harry harsh has hash hassle haste hasten hasty hat hatch hate haul haunt havana have haven havoc hawaii hawk hazard haze hazel hazy head heal health heap hear heard 
heart hearth hearty heat heater heaven heavy hebrew heck hectic hedge heel hefty height heir held helium helix hell hello helm helmet help hemp hence henry her herald herb herd here hereby hermes hernia hero heroic heroin hey heyday hick hidden hide high higher highly hill him hind hindu hint hippy hire his hiss hit hitler hive hoard hoarse hobby hockey hold holder hole hollow holly holy home honest honey hood hook hope horn horny horrid horror horse hose host hot hotel hound hour house hover how huge hull human humane humble humid hung hunger hungry hunt hurdle hurl hurry hurt hush hut hybrid hymn hyphen ice icing icon idaho idea ideal idiom idiot idle idly idol ignite ignore ill image immune impact imply import impose inca incest inch income incur indeed index india indian indoor induce inept inert infant infect infer influx inform inject injure injury inlaid inland inlet inmate inn innate inner input insane insect insert inset inside insist insult insure intact intake intend inter into invade invent invest invite invoke inward iowa iran iraq irish iron ironic irony isaac isabel islam island isle israel issue italy itch item itself ivan ivory jack jacket jacob jade jaguar jail james japan jargon java jaw jazz jeep jelly jerky jersey jest jesus jet jewel jewish jim job jock jockey john join joint joke jolly jolt jordan joseph joy joyful joyous judas judge judy juice juicy july jumble jumbo jump june jungle junior junk junta jury just kansas karate karl keel keen keep keeper kenya kept kernel kettle key khaki kick kid kidnap kidney kill killer kin kind kindly king kiss kite kitten knack knee knew knife knight knit knob knock knot know known koran korea kuwait label lace lack lad ladder laden lady lagoon laity lake lamb lame lamp lance land lane laos lap lapse large larval laser last latch late lately latent later latest latin latter laugh launch lava lavish law lawful lawn lawyer lay layer layman lazy lead leader leaf leafy league leak leaky lean leap learn lease 
leash least leave led ledge left leg legacy legal legend legion lemon lend length lens lent leo leper lesion less lessen lesser lesson lest let lethal letter level lever levy lewis liable liar libel libya lice lick lid lie lied life lift light like likely lima limb lime limit limp line linear linen linger link lion lip liquid liquor lisbon list listen lit live lively liver liz lizard load loaf loan lobby lobe local locate lock locus lodge loft lofty log logic logo london lone lonely long longer look loop loose loosen loot lord lorry lose loss lost lot lotion lotus loud loudly lounge lousy louvre love lovely lover low lower lowest loyal lucid luck lucky lucy lull lump lumpy lunacy lunar lunch lung lure lurid lush lust lute luther luxury lying lymph lynch lyric macho macro mad madam madame made madrid mafia magic magma magnet magnum maid maiden mail main mainly major make maker male malice mall malt malta mammal manage mane mania manic manner manor mantle manual manure many map maple marble march mare margin maria marina mark market marry mars marsh martin martyr mary mask mason mass mast master match mate matrix matter mature maxim may maya maybe mayor maze mead meadow meal mean meant meat mecca medal media median medic medium meet mellow melody melon melt member memo memory menace mend mental mentor menu mercy mere merely merge merger merit merry mesh mess messy met metal meter method methyl metric metro mexico miami mickey mid midday middle midst midway might mighty milan mild mildew mile milk milky mill mimic mince mind mine mini mink minor mint minus minute mirror mirth misery miss mist misty mite mix moan moat mobile mock mode model modem modern modest modify module moist molar mole molten moment monaco monday money monies monk monkey month mood moody moon moor moral morale morbid more morgue mortal mortar mosaic moscow moses moslem mosque moss most mostly moth mother motion motive motor mould mount mourn mouse mouth move movie mrs much muck mucus mud muddle 
muddy mule mummy munich murder murky murmur muscle museum music muslim mussel must mutant mute mutiny mutter mutton mutual muzzle myopic myriad myself mystic myth nadir nail naked name namely nape napkin naples narrow nasal nasty nation native nature nausea naval nave navy nazi near nearer nearly neat neatly neck need needle needy negate neon nepal nephew nerve nest neural never newark newly next nice nicely niche nickel niece night nile nimble nine ninety ninth nobel noble nobody node noise noisy non none noon nor norm normal north norway nose nosy not note notice notify notion nought noun novel novice now nozzle nude null numb number nurse nylon nymph oak oasis oath obese obey object oblige oboe obtain occult occupy occur ocean octave odd off offend offer office offset often ohio oil oily okay old older oldest olive omega omen omit once one onion only onset onto onus onward opaque open openly opera opium oppose optic option oracle oral orange orbit orchid ordeal order organ orgasm orient origin ornate orphan oscar oslo other otter ought ounce our out outer output outset oval oven over overt owe owing owl own owner oxford oxide oxygen oyster ozone pace pack packet pact paddle paddy pagan page paid pain paint pair palace pale palm panama panel panic papa papal paper parade parcel pardon parent paris parish park parody parrot part partly party pascal pass past paste pastel pastor pastry pat patch patent path patio patrol patron paul pause pave pawn pay peace peach peak pear pearl pedal peel peer peking pelvic pelvis pen penal pence pencil penis penny people pepper per perch peril period perish permit person peru pest peter petite petrol petty phase philip phone photo phrase piano pick picket picnic pie piece pier pierce piety pig pigeon piggy pike pile pill pillar pillow pilot pin pinch pine pink pint pious pipe pirate piss pistol piston pit pitch pity pivot pixel pizza place placid plague plain plan plane planet plank plant plasma plate play player plea plead 
please pledge plenty plenum plight plot ploy plug plum plump plunge plural plus plush pocket poem poet poetic poetry point poison poland polar pole police policy polish polite poll pollen polo pond ponder pony pool poor poorly pop pope poppy pore pork port portal pose posh post postal pot potato potent pouch pound pour powder power prague praise pray prayer preach prefer prefix press pretty price pride priest primal prime prince print prior prism prison privy prize probe profit prompt prone proof propel proper prose proton proud prove proven proxy prune psalm pseudo psyche pub public puff pull pulp pulpit pulsar pulse pump punch punish punk pupil puppet puppy pure purely purge purify purple purse pursue push pushy pussy put putt puzzle quaint quake quarry quartz quay quebec queen queer query quest queue quick quid quiet quilt quirk quit quite quiver quiz quota quote rabbit race racial racism rack racket radar radio radish radius raffle raft rage raid rail rain rainy raise rally ramp random range rank ransom rape rapid rare rarely rarity rash rat rate rather ratify ratio rattle rave raven raw ray razor reach react read reader ready real really realm reap rear reason rebel recall recent recess recipe reckon record recoup rector red redeem reduce reed reef refer reform refuge refuse regal regard regent regime region regret reign reject relate relax relay relic relief relish rely remain remark remedy remind remit remote remove renal render rent rental repair repeal repeat repent reply report rescue resent reside resign resin resist resort rest result resume retail retain retina retire return reveal review revise revive revolt reward rex rhine rhino rhyme rhythm ribbon rice rich rick rid ride rider ridge rife rifle rift right rigid ring rinse riot ripe ripen ripple rise risk risky rite ritual ritz rival river road roar roast rob robe robert robin robot robust rock rocket rocky rod rode rodent rogue role roll roman rome roof room root rope rosa rose rosy rotate rotor 
rotten rouge rough round route rover row royal rubble ruby rudder rude rugby ruin rule ruler rumble rump run rune rung runway rural rush russia rust rustic rusty sack sacred sad saddle sadism sadly safari safe safely safer safety saga sage sahara said sail sailor saint sake salad salary sale saline saliva salmon saloon salt salty salute sam same sample sand sandy sane sash satan satin satire saturn sauce saudi sauna savage save saxon say scale scalp scan scant scar scarce scare scarf scary scene scenic scent school scope score scorn scot scotch scout scrap scream screen screw script scroll scrub scum sea seal seam seaman search season seat second secret sect sector secure see seed seeing seek seem seize seldom select self sell seller semi senate send senile senior sense sensor sent sentry seoul sequel serene serial series sermon serum serve server set settle seven severe sewage sex sexual sexy shabby shade shadow shady shaft shaggy shah shake shaky shall sham shame shape share shark sharp shawl she shear sheen sheep sheer sheet shelf shell sherry shield shift shine shiny ship shire shirt shit shiver shock shoe shook shoot shop shore short shot should shout show shower shrank shrewd shrill shrimp shrine shrink shrub shrug shut shy shyly sick side siege sigh sight sigma sign signal silent silk silken silky sill silly silver simple simply since sinful sing singer single sink sir siren sister sit site six sixth sixty size sketch skill skin skinny skip skirt skull sky slab slack slain slam slang slap slate slater slave sleek sleep sleepy sleeve slice slick slid slide slight slim slimy sling slip slit slogan slope sloppy slot slow slowly slug slum slump smack small smart smash smear smell smelly smelt smile smoke smoky smooth smug snack snail snake snap snatch sneak snow snowy snug soak soap sober soccer social sock socket soda sodden sodium sofa soft soften softly soggy soil solar sold sole solely solemn solid solo solve somali some son sonar sonata song sonic sony soon 
sooner soot soothe sordid sore sorrow sorry sort soul sound soup sour source soviet space spade spain span spare spark sparse spasm spat spate speak spear speech speed speedy spell spend sperm sphere spice spicy spider spiky spill spin spinal spine spiral spirit spit spite splash split spoil spoke sponge spoon sport spot spouse spray spread spree spring sprint spur squad square squash squat squid stab stable stack staff stage stain stair stake stale stalin stall stamp stance stand staple star starch stare stark start starve state static statue status stay stead steady steak steal steam steel steep steer stem stench step stereo stern stew stick sticky stiff stifle stigma still sting stint stir stitch stock stocky stone stony stool stop store storm stormy story stout stove strain strait strand strap strata straw stray streak stream street stress strict stride strife strike string strip strive stroke stroll strong stud studio study stuff stuffy stunt stupid sturdy style submit subtle subtly suburb such suck sudan sudden sue suez suffer sugar suit suite suitor sullen sultan sum summer summit summon sun sunday sunny sunset super superb supper supple supply sure surely surf surge survey suture swamp swan swap swarm sway swear sweat sweaty sweden sweep sweet swell swift swim swine swing swirl swiss switch sword swore sydney symbol synod syntax syria syrup system table tablet taboo tacit tackle tact tactic tail tailor taiwan take tale talent talk tall tally tame tandem tangle tank tap tape target tariff tart tarzan task taste tasty tattoo taurus taut tavern tax taxi tea teach teak team tear tease tech teeth tehran tell temper temple tempo tempt ten tenant tend tender tendon tennis tenor tense tensor tent tenth tenure teresa term terror test texas text than thank that the their them theme then thence theory there these thesis they thick thief thigh thin thing think third thirst thirty this thomas thorn those though thread threat three thrill thrive throat throne throng 
throw thrust thud thug thumb thus thyme tibet tick ticket tidal tide tidy tie tier tiger tight tile till tilt timber time timid tin tiny tip tissue title toad toast today toilet token tokyo told toll tom tomato tomb tonal tone tongue tonic too took tool tooth top topaz topic torch torque torso tort toss total touch tough tour toward towel tower town toxic toxin trace track tract trade tragic trail train trait tram trance trap trauma travel tray tread treat treaty treble tree trek tremor trench trend trendy trial tribal tribe trick tricky tried trifle trim trio trip triple troop trophy trot trough trout truce truck true truly trunk trust truth try tsar tube tumble tuna tundra tune tung tunic tunnel turban turf turk turkey turn turtle tutor tweed twelve twenty twice twin twist two tycoon tying type tyrant uganda ugly ulcer ultra umpire unable uncle under uneasy unfair unify union unique unit unite unity unlike unrest unruly until update upheld uphill uphold upon uproar upset upshot uptake upturn upward urban urge urgent urging urine usable usage use useful user usual uterus utmost utter vacant vacuum vagina vague vain valet valid valley value valve van vanish vanity vary vase vast vat vault vector veil vein velvet vendor veneer venice venom vent venue venus verb verbal verge verify verity verse versus very vessel vest veto via viable vicar vice victim victor video vienna view vigil viking vile villa vine vinyl viola violet violin viral virgin virgo virtue virus visa vision visit visual vital vivid vocal vodka vogue voice void volley volume vomit vote vowel voyage vulgar wade wage waist wait waiter wake walk walker wall wallet walnut wander want war warden warm warmth warn warp warsaw wary was wash wasp waste watch water watery wave way weak weaken wealth weapon wear weary wedge wee weed week weekly weep weight weird well were west wet whale wharf what wheat wheel when whence where which whiff whig while whim whip whisky white who whole wholly whom whore whose why 
wide widely widen wider widow width wife wild wildly wilful will willow win wind window windy wine wing wink winner winter wipe wire wisdom wise wish wit witch with within witty wizard woke wolf wolves woman womb won wonder wood wooden woods woody wool word work worker world worm worry worse worst worth worthy would wound wrap wrath wreath wreck wright wrist writ write writer wrong xerox yacht yale yard yarn yeah year yeast yellow yemen yet yield yogurt yolk york you young your youth zaire zeal zebra zenith zero zigzag zinc zombie zone zurich zygote ```
The first 4096 words encode bit patterns from 0x000 to 0xFFF (so "aback" is 0x000, "abbey" is 0x001, ..., "will" is 0xFAB, "willow" is 0xFAC, and so on).
The canonical word list order is alphabetical. When one word is an exact prefix of another (for example "will" and "willow"), the shorter one sorts "first" (closer to A, further from Z).
Encoding operates on chunks of 24 bits: three octets of input (8 bits x 3), output two words (12 bits x 2).

For example, if the first three bytes of the input are 0x01 0x23 0x45, that's 0x012345, which re-splits as 0x012 0x345, so the output is accent cruise.
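
The re-split itself is just integer arithmetic. A minimal Python sketch, with `split24` and `join24` as hypothetical helper names (not part of the spec), working on word indices rather than words:

```python
def split24(chunk: bytes) -> tuple[int, int]:
    """Split exactly three octets into two 12-bit word indices."""
    if len(chunk) != 3:
        raise ValueError("expected exactly three octets")
    value = int.from_bytes(chunk, "big")   # 0x01 0x23 0x45 -> 0x012345
    return value >> 12, value & 0xFFF      # -> (0x012, 0x345)


def join24(first: int, second: int) -> bytes:
    """Inverse of split24: two 12-bit indices back into three octets."""
    return ((first << 12) | second).to_bytes(3, "big")
```

Each index is then looked up in the word list to get the output word.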
The last/4097th word, "zygote", prevents ambiguous outputs by indicating that the next word only contributes to one byte.

For example,
- 0x12 0x34 0x00 re-splits as 0x123 0x400, encodes to basin drag.
- 0x12 0x34 re-splits as 0x123 0x4__, encodes to basin zygote drag.
- 0x12 0x00 re-splits as 0x120 0x0__, encodes to base zygote aback.
- 0x12 re-splits as 0x12_ 0x___, encodes to zygote base.
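
Putting the full-chunk and short-read cases together, an encoder could look roughly like the Python sketch below. The names are hypothetical, and `PLACEHOLDER_WORDS` is a stand-in list that just spells out each index in hex so the sketch runs on its own; a real implementation would load the actual word list instead.

```python
ZYGOTE = 4096  # index of the 4097th word, the short-read marker


def encode_indices(data: bytes) -> list[int]:
    """Return word-list indices (0..4096) for data."""
    out = []
    full = len(data) - len(data) % 3
    for i in range(0, full, 3):
        value = int.from_bytes(data[i:i + 3], "big")
        out += [value >> 12, value & 0xFFF]
    tail = data[full:]
    if len(tail) == 2:
        # First word is complete; the marked word carries only 4 real bits.
        out += [(tail[0] << 4) | (tail[1] >> 4), ZYGOTE, (tail[1] & 0xF) << 8]
    elif len(tail) == 1:
        # Only one byte left: marker first, low 4 bits of the word are zero.
        out += [ZYGOTE, tail[0] << 4]
    return out


def encode(data: bytes, words: list[str]) -> str:
    return " ".join(words[i] for i in encode_indices(data))


# Placeholder list so the sketch runs; NOT the real pwqgen word list.
PLACEHOLDER_WORDS = [f"w{i:03x}" for i in range(4096)] + ["zygote"]

print(encode(bytes([0x12, 0x34]), PLACEHOLDER_WORDS))  # w123 zygote w400
```

With the real list, index 0x123 would be "basin" and 0x400 would be "drag", matching the examples above.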
Implementations should accept redundant encodings where the word "zygote" occurs in the middle of the word string, or multiple times, so long as each occurrence is individually valid.

For example, zygote base zygote crow is the catenation of separately encoding 0x12 and 0x34, and that whole string should decode to 0x12 0x34. Similarly, basin zygote drag forum mast should decode to 0x12 0x34 0x56 0x78 0x90.
Implementations should reject unused encodings where the ignored bits are non-zero (basin zygote dragon, basin zygote fig, zygote basic, zygote beacon, ...) and where the word zygote appears as the last word.
Decoding should be case-insensitive.
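
A decoder following the rules above might be sketched like this in Python. Two behaviours here are my assumptions rather than something the spec spells out: "zygote" directly followed by another "zygote" is rejected, and so is a lone unmarked word left over at the end. All names are hypothetical, and `PLACEHOLDER_WORDS` is again a stand-in for the real word list.

```python
ZYGOTE = 4096  # index of the marker word


def decode_indices(indices):
    out = bytearray()
    first = None     # pending first word of a pair, if any
    marked = False   # True when the previous word was the marker
    for idx in indices:
        if idx == ZYGOTE:
            if marked:
                # Assumption: the marker must modify a real word.
                raise ValueError("zygote followed by zygote")
            marked = True
        elif marked and first is None:
            # Marker + word at the start of a pair: one byte, low 4 bits unused.
            if idx & 0x00F:
                raise ValueError("non-zero ignored bits")
            out.append(idx >> 4)
            marked = False
        elif marked:
            # Word + marker + word: two bytes, low 8 bits unused.
            if idx & 0x0FF:
                raise ValueError("non-zero ignored bits")
            out += bytes([first >> 4, ((first & 0xF) << 4) | (idx >> 8)])
            first, marked = None, False
        elif first is None:
            first = idx
        else:
            out += ((first << 12) | idx).to_bytes(3, "big")
            first = None
    if marked:
        raise ValueError("zygote as the last word")
    if first is not None:
        # Assumption: a trailing unmarked word is treated as an error.
        raise ValueError("dangling word")
    return bytes(out)


def decode(text, words):
    index = {w: i for i, w in enumerate(words)}
    try:
        # Lowercasing each token makes decoding case-insensitive.
        return decode_indices([index[t.lower()] for t in text.split()])
    except KeyError as e:
        raise ValueError(f"unknown word: {e}") from None


# Placeholder list so the sketch runs; NOT the real pwqgen word list.
PLACEHOLDER_WORDS = [f"w{i:03x}" for i in range(4096)] + ["zygote"]
```

Because the marker comes before the word it modifies, the loop never has to hold back output for a word until it has seen the next one.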

[Click to expand for rationale on the above points]

1. Why "zygote" for the extra word? It sorts last; its spelling and sound are distinct, even with mistakes; it keeps all words within 3-6 letters; and the idea of being undeveloped is a mnemonic for not completely using an encoded word.
2. [nothing to explain]
3. Why alphabetical? It is human-friendly, obvious even without explanation, and enables incremental binary search on the word list when decoding. Similarly, sorting shorter before longer enables another decoding optimization: `\0` < `a` < `b` < ... < `y` < `z` in all encodings, so if you append nulls to the shorter words, you can just lexically compare letters without code and branches checking string sizes.
4. [nothing to explain]
5. Why put "zygote" before the partially-used word rather than after? It makes it approximately impossible to parse incomplete or delayed input as a different valid input.
6. Why allow these inefficient redundant encodings?
   1. It's the cleanest way to prevent a surprising inconsistency: if we don't allow it, it is inescapable that outputs without `zygote` *can* be prepended validly to other outputs, but ones with `zygote` can't, and users might not notice until they're already relying on it working.
   2. When two operations commute (here, when all valid outputs can be catenated and then decode to the catenation of their inputs), it gives users more flexibility and power in a fundamental, permeating way that often turns out useful in hard-to-predict ways.
   3. If such redundant encodings are allowed, "normalizing" them is a trivial decode and re-encode operation. If they're not accepted but the use case turns up, a user has to go out of their way, possibly reimplementing or mentally stepping through the decoding and encoding, just to give the result to the decoder that rejects this.
7. Why reject zygote'd words that imply non-zero ignored bits? Because unlike the redundant encodings in the last point, this doesn't have a natural use case outweighing the default best practice, which is that rejecting helps catch errors instead of turning errors into silent misbehavior, and protects the ecosystem by nipping buggy implementations and mutually incompatible extensions in the bud.
8. Why case-insensitive decoding? Rejecting "wrong" casing would be a nuisance for users; overloading case with significant meaning could easily become a confusing and non-obvious source of errors; and either would get in the way of many possible useful uses of case, such as reducing visual ambiguity with some fonts, helping users not lose their place in a long encoded word string, and so on.
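
The null-padding idea from the rationale for alphabetical order can be sketched in Python as below. This is only an illustration: `pad`, `build_table` and `lookup` are hypothetical names, and the table here is a seven-word excerpt of the list rather than the full 4097 words. Python's own string comparison already sorts a prefix before the longer word, so the padding mostly mirrors what a fixed-width comparison in a lower-level language would do.

```python
from bisect import bisect_left

WIDTH = 6  # every word in the list is at most six letters


def pad(word: str) -> bytes:
    """Lowercase the word and null-pad it to a fixed width."""
    return word.lower().encode("ascii").ljust(WIDTH, b"\0")


def build_table(words):
    return [pad(w) for w in words]


def lookup(table, word):
    """Binary-search the padded table; return the word's index."""
    key = pad(word)
    i = bisect_left(table, key)
    if i == len(table) or table[i] != key:
        raise ValueError(f"unknown word: {word!r}")
    return i


# Excerpt of the word list, in its canonical order.
EXCERPT = ["wilful", "will", "willow", "win", "wind", "window", "windy"]
TABLE = build_table(EXCERPT)

# Fixed-width byte order agrees with the canonical order.
assert TABLE == sorted(TABLE)
print(lookup(TABLE, "Will"), lookup(TABLE, "willow"))  # 1 2
```

With the full list in place of the excerpt, the returned index would be the 12-bit value the word encodes (or 4096 for "zygote").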

ctsrc / Base256

This is really cool! #1