Opcode mnemonics for SubX

akkartik commented 4 years ago

Giving mnemonics to x86 opcodes might make SubX easier for newcomers to read. This issue is intended to track the pros and cons.

Previous discussion: https://news.ycombinator.com/item?id=21268252#21293301

Initial list of pros and cons:

Pro: names give some indication (albeit imprecise) of the operation being performed.
Pro: names can be checked by tooling. Typoing 8d for 8f seems easier than typoing copy for pop.
Con: per-opcode mnemonics are an additional set of things for the new reader to understand, navigate, and cross-correlate with the Intel manual, which only gives names to sets of opcodes.
Con: tooling may have a hard time giving a good error message when a user typos one variant of, say, add for another.
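
A minimal sketch of the tooling trade-off in the list above, written in Python rather than SubX. The mnemonic names and the helpers `parse_hex_opcode` and `parse_named_opcode` are hypothetical and not part of SubX. Only the opcode values are taken from the x86 encoding (8d is LEA, 8f is POP r/m32).

```python
# Hypothetical mnemonic table: the names are made up for illustration.
MNEMONICS = {
    "copy-address": 0x8D,  # x86 LEA
    "pop": 0x8F,           # x86 POP r/m32
}


def parse_hex_opcode(token):
    """Accept any hex byte; a typo that is still valid hex goes unnoticed."""
    value = int(token, 16)
    if not 0 <= value <= 0xFF:
        raise ValueError(f"not a single byte: {token!r}")
    return value


def parse_named_opcode(token):
    """Accept only known names; a misspelled name is rejected."""
    if token not in MNEMONICS:
        raise ValueError(f"unknown mnemonic: {token!r}")
    return MNEMONICS[token]
```

With this sketch, `parse_hex_opcode("8f")` succeeds even if the author meant 8d, while `parse_named_opcode("pip")` raises an error. Neither function can tell when one valid name was written where another valid name was intended, which is the situation the last con describes.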

akkartik commented 4 years ago

Con: tooling will take code to implement, code that will add to the complexity of the codebase readers have to navigate.

e12e commented 4 years ago

Would mnemonics (for individual concrete opcodes, e.g. add_immediate) via a basic table lookup be in any significant way more complex than the literal hex numbers?

akkartik commented 4 years ago

Depends on how you think of complexity.

For code complexity, there's a little extra complexity to looking up a table rather than just parsing an integer. It's not much, but the self-hosting translator has to support it with raw machine code. SubX doesn't have hash tables yet. Our 'tables' at the moment are just lists of key-value pairs that are checked linearly.
For cognitive complexity, you end up needing to look up various details every so often from the Intel manual. That requires opcodes. So you can't really program SubX while remaining oblivious of opcodes entirely. Now you have to juggle two sets of names for the same things.

Neither of these is a deal-breaker. I think there may be value in names that are conveniences but don't try to perfectly hide opcodes. And linear scan is fine when a table is just ~100 entries long. The bulk of time will be spent on string comparisons anyway.
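
The kind of lookup discussed here can be sketched as a linear scan over a list of key-value pairs. This is an illustration in Python, not the SubX translator's code: the table name, the mnemonic names, and the functions `lookup` and `translate_opcode` are all assumptions. The opcode values are standard x86 encodings.

```python
# Hypothetical table of (mnemonic, opcode) pairs, checked linearly.
# A real table would have roughly 100 entries; four are enough to show the idea.
OPCODE_TABLE = [
    ("add_immediate_to_eax", 0x05),  # x86 ADD EAX, imm32
    ("copy", 0x89),                  # x86 MOV r/m32, r32
    ("pop", 0x8F),                   # x86 POP r/m32
    ("return", 0xC3),                # x86 RET (near)
]


def lookup(name):
    """Scan the table front to back; most of the cost is string comparison."""
    for key, value in OPCODE_TABLE:
        if key == name:
            return value
    return None


def translate_opcode(token):
    """Try the token as a mnemonic first, then fall back to a raw hex byte."""
    opcode = lookup(token)
    if opcode is not None:
        return opcode
    return int(token, 16)
```

Falling back to hex keeps the names a convenience rather than a replacement, so `translate_opcode("return")` and `translate_opcode("c3")` yield the same byte.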

Would you be interested in coming up with mnemonics for all the opcodes? The complete list is in the opcodes file in this repo, and can also be obtained by running subx help opcodes. Like I suggested on HN, I think the next step would be to start with some simple example (say apps/factorial.subx) and try out names by hand.

e12e commented 4 years ago

I'll try to find time to have a go, maybe as you suggest by reworking what's needed for a small example first, to get a feel for the different syntaxes.

Surely without mnemonics a user (programmer) will still need to consult a table for help looking up the code for the desired op?

akkartik commented 4 years ago

Very happy to hear it. Looking forward to seeing what you come up with.

While reading code, an unfamiliar programmer will hopefully see all the existing code formatted and documented in a consistent manner so no table lookups are needed.

While writing code, I think an unfamiliar programmer will have to consult a table anyway. I'd like for it to be a single table, and not two tables. Or more precisely, I'd like them to have to perform only a single mapping in their minds, rather than two. I'm starting to think it can be done with mnemonics.

My first priority is the unfamiliar programmer who is new to this codebase. So I'm very conscious that my choice of opcodes may seem easier to me than to others. That makes your feedback and this ticket very valuable.

By the same token, once you or someone designs a mnemonic system, it may seem easier to you than to others (since you created it). As a first cut, it'll be interesting to see how I react to it (which is partly why I'd like for someone else to take the lead in designing it). But it'll also be interesting to see what questions others have over time.

Incidentally, I'd recommend sticking to a single name for each opcode for now, rather than trying to add syntax like add|immediate. Let's start with the least structure possible and see how far it gets us.

akkartik / mu

Opcode mnemonics for SubX #39