thlorenz / visulator

A machine emulator that visualizes how each instruction is processed
https://thlorenz.github.io/visulator
GNU General Public License v3.0

visulator

Status

MAD SCIENCE

Play with it

API and Notes


cu::_byteRegPair

Used for byte sized operations on registers or a register pair.

In case a register pair is used the names of both registers are provided. In case only one register is used, the same code is used except we only use the first register of the pair.

Source:

cu::_dwordRegPair

Used for any operation that operates on a register pair, e.g. mov, add, etc.

The same code is used regardless of the pair size (dword or word). For byte-sized general purpose registers we use _byteRegPair instead.

Operations for smaller pairs just have a different opcode than dword operations prefixing the pair.

Certain operations like add/sub only use first reg of the pair, addressing it via the pair code. In that case the operation may also be encoded in the reg pair code, i.e.

add ecx, ...  ; 83 c1 ... uses c1 to indicate ecx
sub ecx, ...  ; 83 e9 ... uses e9 to indicate ecx
cmp ecx, ...  ; 83 f9 ... uses f9 to indicate ecx
Source:
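
The bytes c1, e9 and f9 in the listing above can be read as x86 ModR/M bytes: the top two bits select register-direct mode, the middle three bits select the operation for opcode 83, and the low three bits select the register. The following is a minimal sketch of that split; `decodeModRM` and `describe83` are hypothetical helpers for illustration, not functions taken from visulator.

```javascript
// Hypothetical helpers, not part of visulator.
const REGS = ['eax', 'ecx', 'edx', 'ebx', 'esp', 'ebp', 'esi', 'edi'];
// Operations selected by the middle field when the opcode is 0x83.
const GROUP1 = ['add', 'or', 'adc', 'sbb', 'and', 'sub', 'xor', 'cmp'];

// Splits a ModR/M byte into its three bit fields.
function decodeModRM(byte) {
  return {
    mod: (byte >> 6) & 0b11,  // 0b11 means the operand is a register
    reg: (byte >> 3) & 0b111, // operation selector for opcode 0x83
    rm: byte & 0b111          // register index
  };
}

// Names the operation and register encoded in the byte following 0x83.
function describe83(byte) {
  const { reg, rm } = decodeModRM(byte);
  return GROUP1[reg] + ' ' + REGS[rm];
}

console.log(describe83(0xc1), '|', describe83(0xe9), '|', describe83(0xf9));
```

Under this reading, a single byte carries both the operation and the register, which matches the note that the operation may be encoded in the reg pair code.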

registers::_flagIndexes

Index of each flag in the eflags register.

Source:

registers::_flagMasks

Flags representation for each case of ONE flag set at a time. Used to isolate each flag for flag operations.

Flag's Meanings

  • CF: carry flag, set if the result of an add or shift operation carries out a bit beyond the destination operand; otherwise cleared
  • PF: parity flag, set if the number of 1-bits in the low byte of the result is even; otherwise cleared
  • AF: adjust flag (auxiliary carry), used for 4-bit BCD math; set when an operation causes a carry out of a 4-bit BCD quantity
  • ZF: zero flag, set if the result of an operation is zero; otherwise cleared
  • TF: trap flag, used by debuggers; permits operation of a processor in single-step mode
  • SF: sign flag, set when the result is negative, i.e. its most significant bit is set
  • IF: interrupt enable flag, determines whether or not the CPU will handle maskable hardware interrupts
  • DF: direction flag, controls the left-to-right or right-to-left direction of string processing
  • OF: overflow flag, set if the signed result is too large to fit in the destination operand

see: wiki flags register

Source:
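
As a rough illustration of how flag indexes and single-flag masks relate, the sketch below derives one mask per flag from its bit position in eflags. The bit positions are the standard x86 ones; the object names `flagIndexes` and `flagMasks` are assumptions chosen for this example and may not match the layout of the actual `_flagIndexes` and `_flagMasks`.

```javascript
// Standard x86 bit positions of each flag inside eflags.
// The object names are illustrative only.
const flagIndexes = { CF: 0, PF: 2, AF: 4, ZF: 6, SF: 7, TF: 8, IF: 9, DF: 10, OF: 11 };

// One mask per flag, with exactly that flag's bit set.
const flagMasks = {};
for (const [name, index] of Object.entries(flagIndexes)) {
  flagMasks[name] = 1 << index;
}

console.log(flagMasks.ZF.toString(2)); // binary form of the zero flag mask
```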

auxiliary(dst, src) → {Boolean}

Determines if a carry or borrow has been generated out of the least significant four bits when adding src to dst (see: wiki).

Parameters:
Name Type Description
dst Number

destination register

src Number

source register

Source:
Returns:

true if a half-carry occurs when adding src to dst, otherwise false

Type
Boolean
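
One common way to detect a half-carry for addition is to add only the low nibbles and check whether the sum no longer fits in four bits. The function below is a sketch of that idea under the name `auxiliarySketch`, which is hypothetical; the real `auxiliary` may be implemented differently.

```javascript
// Hypothetical sketch: true if adding src to dst carries out of bit 3.
function auxiliarySketch(dst, src) {
  // Keep only the low four bits of each operand, then check for a fifth bit.
  return ((dst & 0x0f) + (src & 0x0f)) > 0x0f;
}

console.log(auxiliarySketch(0x0f, 0x01), auxiliarySketch(0x10, 0x01));
```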

cu::_dec(opcode, asm, srcbytes, dstbytes)

Decrement a register

Parameters:
Name Type Description
opcode
asm
srcbytes
dstbytes
Source:

cu::_inc(opcode, asm, srcbytes, dstbytes)

Increment a register

Parameters:
Name Type Description
opcode
asm
srcbytes
dstbytes
Source:

cu::_movr(opcode, asm, srcbytes)

Moves one register into another. In order to execute this instruction we read the next code byte. It tells us which register pairs are affected (i.e. which register to move into which).

We look these up via a table.

Parameters:
Name Type Description
opcode Number
asm String
srcbytes Number

the size of the (sub)register to move

Source:
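
To make the role of the "next code byte" concrete, here is a sketch that computes the affected registers for the register-to-register form of opcode 89 (mov r/m32, r32) directly from the bit fields, instead of using a lookup table as described above. `decodeMovRegReg` is a hypothetical name; the table in visulator may be organized differently.

```javascript
// Hypothetical sketch for opcode 0x89 (mov r/m32, r32), register-direct form only.
const REG32 = ['eax', 'ecx', 'edx', 'ebx', 'esp', 'ebp', 'esi', 'edi'];

function decodeMovRegReg(modrm) {
  if ((modrm >> 6) !== 0b11) {
    throw new Error('not a register-to-register form');
  }
  return {
    dst: REG32[modrm & 0b111],        // low three bits: destination register
    src: REG32[(modrm >> 3) & 0b111]  // middle three bits: source register
  };
}

const pair = decodeMovRegReg(0xe5); // the byte following 0x89 in "89 e5"
console.log('mov ' + pair.dst + ', ' + pair.src);
```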

cu::next()

Fetches, decodes and executes next instruction and stores result.

This implementation totally ignores a few things about modern processors and instead uses a much simpler algorithm to fetch and execute instructions and store the results.

Here are some concepts that make modern processors faster, but are not employed here, followed by the simplified algorithm we actually use here.

Pipelining

  • instructions are processed in a pipeline fashion with about 5 stages, each happening in parallel for multiple instructions
    • load instruction
    • decode instruction
    • fetch data
    • execute instruction
    • write results for instruction
  • at a given time instruction A is loaded, B is decoded, data is fetched for C, D is executing and results for E are written
  • see: moores-law-in-it-architecture

Caches

  • modern processors have L1, L2 and L3 caches
  • L1 and L2 are on the processor while L3 is connected via a high speed bus
  • data that is used a lot, notably the stack and the code about to be executed, is usually found in one of these caches, saving a more expensive trip to main memory

Branch Prediction

  • in order to increase speed the processor tries to predict which branch of code is executed next in order to pre-fetch instructions
  • IA-64 replaces this by predication which even allows the processor to execute all possible branch paths in parallel

Translation to RISC like micro-instructions

  • starting with the Pentium Pro (P6) instructions are translated into RISC like micro-instructions
  • these micro-instructions are then executed (instead of the original ones) on a highly advanced core
  • see: pentium-pro-p6-6th-generation-x86-microarchitecture

Simplified Algorithm

  • 1) fetch next instruction (always from main memory -- caches don't exist here)
  • 2) decode instruction and decide what to do
  • 3) execute instruction, some directly here and others via the ALU
  • 4) store the result in registers/memory
  • 5) goto 1

Steps 1 and 2 basically become one step, since we just call a function named after the opcode of the mnemonic.

We then fetch more bytes from the code in order to complete the instruction from memory (something that is inefficient and not done in the real world, where multiple instructions are pre-fetched instead).

The decoder is authored using this information.
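
The simplified algorithm can be sketched as a loop that looks up a handler by opcode, so that fetching and decoding collapse into a single table lookup. This is only an illustration under stated assumptions: `run`, the `handlers` table and the register object are hypothetical, and only the one-byte opcodes 40 to 47 (inc), 48 to 4f (dec) and f4 (hlt) are handled.

```javascript
// Hypothetical sketch of a fetch/decode/execute loop, not visulator's implementation.
const REG32 = ['eax', 'ecx', 'edx', 'ebx', 'esp', 'ebp', 'esi', 'edi'];

function run(code) {
  const regs = { eip: 0 };
  for (const r of REG32) regs[r] = 0;
  let halted = false;

  // One handler per opcode; ">>> 0" keeps values in the unsigned 32-bit range.
  const handlers = {};
  for (let i = 0; i < 8; i++) {
    handlers[0x40 + i] = () => { regs[REG32[i]] = (regs[REG32[i]] + 1) >>> 0; };
    handlers[0x48 + i] = () => { regs[REG32[i]] = (regs[REG32[i]] - 1) >>> 0; };
  }
  handlers[0xf4] = () => { halted = true; }; // hlt ends the loop

  while (!halted && regs.eip < code.length) {
    const opcode = code[regs.eip++];   // 1) fetch
    const handler = handlers[opcode];  // 2) decode is just a lookup
    if (!handler) throw new Error('unsupported opcode 0x' + opcode.toString(16));
    handler();                         // 3) execute and 4) store the result
  }                                    // 5) goto 1
  return regs;
}

console.log(run([0x40, 0x40, 0x41, 0x48, 0xf4]));
```

Instructions with operands would read further bytes inside their handler, which corresponds to fetching more bytes from the code to complete the instruction.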

Source:

cu::_push_reg(opcode)

Pushes a 32-bit register onto the stack (opcodes 0x50 to 0x57).

50   push   eax
51   push   ecx
52   push   edx
53   push   ebx
54   push   esp
55   push   ebp
56   push   esi
57   push   edi
Parameters:
Name Type Description
opcode
Source:
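
Since the eight push opcodes differ only in their low three bits, the register can be derived from the opcode itself. The sketch below assumes a plain object for the registers and an array for memory; `pushReg` is a hypothetical helper and does not mirror the signature of `_push_reg`.

```javascript
// Hypothetical sketch: push a 32-bit register selected by opcodes 0x50 to 0x57.
const REG32 = ['eax', 'ecx', 'edx', 'ebx', 'esp', 'ebp', 'esi', 'edi'];

function pushReg(opcode, regs, memory) {
  const name = REG32[opcode & 0x07]; // low three bits select the register
  const value = regs[name];          // read first, so "push esp" stores the old esp
  regs.esp -= 4;                     // the stack grows towards lower addresses
  for (let i = 0; i < 4; i++) {
    memory[regs.esp + i] = (value >>> (8 * i)) & 0xff; // little endian byte order
  }
  return name;
}

const regs = { eax: 0x11223344, ecx: 0, edx: 0, ebx: 0, esp: 16, ebp: 0, esi: 0, edi: 0 };
const memory = new Array(16).fill(0);
pushReg(0x50, regs, memory);
console.log(regs.esp, memory.slice(12));
```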

hexstring(x)

Converts the given number to a two-digit hex string.

Parameters:
Name Type Description
x Number

number between 0x00 and 0xff

Source:
Returns:

two digit string representation
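
A minimal sketch of such a conversion, assuming the input is already within 0x00 and 0xff (it is masked here to be safe). The name `hexstringSketch` is hypothetical.

```javascript
// Hypothetical sketch: format a byte as exactly two lowercase hex digits.
function hexstringSketch(x) {
  return (x & 0xff).toString(16).padStart(2, '0');
}

console.log(hexstringSketch(10), hexstringSketch(255));
```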

leBytes(val, nbytes) → {Array.<Number>}

Inverse of leVal. Converts a value into a buffer of n bytes in little endian order.

Parameters:
Name Type Argument Description
val Number

value 8, 16 or 32 bits

nbytes Number <optional>

number of bytes of the value to include (default: 4)

Source:
Returns:

byte representation of the given val

Type
Array.<Number>
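
The conversion can be sketched by repeatedly shifting the value right by eight bits and keeping the lowest byte each time. `leBytesSketch` is a hypothetical name and assumes values of at most 32 bits, as stated in the parameter description.

```javascript
// Hypothetical sketch: split a value of up to 32 bits into little endian bytes.
function leBytesSketch(val, nbytes = 4) {
  const bytes = [];
  for (let i = 0; i < nbytes; i++) {
    bytes.push((val >>> (8 * i)) & 0xff); // lowest byte first
  }
  return bytes;
}

console.log(leBytesSketch(0x0100), leBytesSketch(0x1234, 2));
```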

leVal(bytes, nbytes) → {Number}

Calculates value of little endian ordered bytes.

leVal([ 0x00, 0x00, 0x00, 0x00 ]) // => 0x00 00 00 00 (            0)
leVal([ 0x01, 0x00, 0x00, 0x00 ]) // => 0x00 00 00 01 (            1)
leVal([ 0xff, 0x00, 0x00, 0x00 ]) // => 0x00 00 00 ff (          255)
leVal([ 0x00, 0x01, 0x00, 0x00 ]) // => 0x00 00 01 00 (          256)
leVal([ 0x01, 0x01, 0x00, 0x00 ]) // => 0x00 00 01 01 (          257)
leVal([ 0xff, 0x01, 0x00, 0x00 ]) // => 0x00 00 01 ff (          511)
leVal([ 0xff, 0xff, 0x00, 0x00 ]) // => 0x00 00 ff ff (       65,535)
leVal([ 0x00, 0x00, 0xff, 0x00 ]) // => 0x00 ff 00 00 (   16,711,680)
leVal([ 0xff, 0xff, 0xff, 0x00 ]) // => 0x00 ff ff ff (   16,777,215)
leVal([ 0x00, 0x00, 0x00, 0x0f ]) // => 0x0f 00 00 00 (  251,658,240)
leVal([ 0x00, 0x00, 0x00, 0xf0 ]) // => 0xf0 00 00 00 (4,026,531,840)
leVal([ 0x00, 0x00, 0x00, 0xff ]) // => 0xff 00 00 00 (4,278,190,080)
leVal([ 0xff, 0xff, 0xff, 0xff ]) // => 0xff ff ff ff (4,294,967,295)
Parameters:
Name Type Argument Description
bytes Array.<Number>

bytes that contain number representation

nbytes Number <optional>

number of bytes, if not given it is deduced

Source:
Returns:

number contained in bytes

Type
Number
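
One way to compute such a value is to walk the bytes from the most significant end and multiply the running total by 256 at each step. Multiplication is used here instead of a left shift because JavaScript shifts produce signed 32-bit results, which would turn 0xff 00 00 00 into a negative number. `leValSketch` is a hypothetical name.

```javascript
// Hypothetical sketch: combine little endian bytes into an unsigned number.
function leValSketch(bytes, nbytes = bytes.length) {
  let val = 0;
  for (let i = nbytes - 1; i >= 0; i--) {
    val = val * 256 + bytes[i]; // the highest byte is processed first
  }
  return val;
}

console.log(leValSketch([0x00, 0x01, 0x00, 0x00]), leValSketch([0xff, 0xff, 0xff, 0xff]));
```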

overflow(op1, op2, res, nbytes) → {Boolean}

Calculates if an overflow occurred due to the last arithmetic operation.

The overflow flag is set when the most significant bit (sign bit) is changed by adding two numbers with the same sign or subtracting two numbers with opposite signs.

A negative result out of positive operands (or vice versa) is an overflow.

overflow flag

Parameters:
Name Type Description
op1 Number

first operand of the arithmetic operation

op2 Number

second operand of the arithmetic operation

res Number

result of the arithmetic operation

nbytes Number

byte sizes of the operands and the result

Source:
Returns:

true if an overflow occurred, otherwise false

Type
Boolean
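
For the addition case, the rule above can be sketched by comparing sign bits: an overflow occurred if both operands share a sign and the result has the opposite one. `addOverflowSketch` is a hypothetical helper that covers addition only and assumes non-negative (unsigned) inputs; subtraction would need the opposite-sign rule instead.

```javascript
// Hypothetical sketch: signed overflow check for an addition of nbytes-sized operands.
function addOverflowSketch(op1, op2, res, nbytes) {
  const signBit = 2 ** (nbytes * 8 - 1);
  // True if the sign bit of v is set; "% 2" ignores any bits above the operand size.
  const negative = v => Math.floor(v / signBit) % 2 === 1;
  return negative(op1) === negative(op2) && negative(res) !== negative(op1);
}

console.log(addOverflowSketch(0x7f, 0x01, 0x80, 1), addOverflowSketch(0xff, 0x01, 0x100, 1));
```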

parity(v) → {Number}

Calculates parity of a given number and returns value to set parity flag to.

Parity is mostly used to check the correctness of serial data communications:

A parity bit, or check bit, is a bit added to the end of a string of binary code that indicates whether the number of bits in the string with the value one is even or odd. Parity bits are used as the simplest form of error detecting code. With odd parity, if the number of bits with a value of 1 is already odd, the parity bit's value is set to zero.

Summary

  • parity flag is set to 0 if the number of set bits is odd
  • parity flag is set to 1 if the number of set bits is even

This method takes around 9 operations, and works for 32-bit words. It first shifts and XORs the eight nibbles of the 32-bit value together, leaving the result in the lowest nibble of v. Next, the binary number 0110 1001 1001 0110 (0x6996 in hex) is shifted to the right by the value represented in the lowest nibble of v. This number is like a miniature 16-bit parity-table indexed by the low four bits in v. The result has the parity of v in bit 1, which is masked and returned.

bithacks

Note that x86 parity only applies to the low 8 bits (see: x86 caveat).

Parameters:
Name Type Description
v Number

32-bit number to get parity for

Source:
Returns:

0 if odd, otherwise 1

Type
Number
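
The table lookup trick described above can be sketched for the x86 case, where only the low byte matters. This variant folds two nibbles instead of eight and inverts the lookup result so that an even count yields 1. `paritySketch` is a hypothetical name, and the actual `parity` may fold the full 32 bits as the description says.

```javascript
// Hypothetical sketch: value for the x86 parity flag, based on the low byte of v.
function paritySketch(v) {
  v &= 0xff;                     // x86 parity only considers the low 8 bits
  v ^= v >> 4;                   // fold the upper nibble into the lower one
  v &= 0x0f;                     // index into the 16-entry parity table
  const odd = (0x6996 >> v) & 1; // 1 if the number of set bits is odd
  return odd ^ 1;                // the flag is 1 for an even count
}

console.log(paritySketch(0x03), paritySketch(0x07));
```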

registers::_createRegister(k)

Registers are stored as a 4 byte array in order to allow accessing sub registers like ax, ah and al easily.

The byte order is little endian to be consistent with how things are stored in memory and thus be able to use the same store/load functions we use for the latter.

As an example eax is stored as follows:

this._eax = [
    0x0 // al
  , 0x0 // ah
  , 0x0 // lower byte of upper word
  , 0x0 // upper byte of upper word
]

Each register part can be accessed via a property, i.e. regs.ah, regs.ax.

Parameters:
Name Type Description
k String

the name of the register

Source:
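
To illustrate why a little endian byte array makes sub registers easy to reach, the sketch below exposes al, ah, ax and eax as accessors over one shared four-byte array. `createRegisterSketch` is a hypothetical factory, not the actual `_createRegister`, and it hard-codes the eax family rather than taking a register name.

```javascript
// Hypothetical sketch: one 4-byte array backing eax and its sub registers.
function createRegisterSketch() {
  const bytes = [0, 0, 0, 0]; // little endian: index 0 holds the lowest byte
  return {
    get al() { return bytes[0]; },
    set al(v) { bytes[0] = v & 0xff; },
    get ah() { return bytes[1]; },
    set ah(v) { bytes[1] = v & 0xff; },
    get ax() { return bytes[0] | (bytes[1] << 8); },
    set ax(v) { bytes[0] = v & 0xff; bytes[1] = (v >>> 8) & 0xff; },
    get eax() {
      // ">>> 0" converts the signed 32-bit result back to an unsigned number.
      return (bytes[0] | (bytes[1] << 8) | (bytes[2] << 16) | (bytes[3] << 24)) >>> 0;
    },
    set eax(v) {
      for (let i = 0; i < 4; i++) bytes[i] = (v >>> (8 * i)) & 0xff;
    }
  };
}

const regs = createRegisterSketch();
regs.eax = 0x12345678;
regs.ah = 0xff;
console.log(regs.eax.toString(16), regs.ax.toString(16), regs.al.toString(16));
```

Writing to ah only touches one array element, so the change is visible through ax and eax without any extra bookkeeping.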

registers::assign(regs)

Assigns the supplied values to the given registers. Leaves all other registers alone.

Parameters:
Name Type Description
regs
Source:

registers::clearFlag(flag)

Clears a given flag

First we invert the mask for the flag to clear. Then we AND the flags with that inverted mask, which clears our flag since that is the only bit in the mask that is 0.

Parameters:
Name Type Description
flag
Source:

registers::getFlag(flag) → {Number}

Returns a given flag

First masks out the bit of the flag we are interested in and then shifts our flag bit into lowest bit.

Parameters:
Name Type Description
flag
Source:
Returns:

1 if flag is set, otherwise 0

Type
Number

registers::setFlag(flag)

Sets a given flag

ORs the flags with the flag's mask. This preserves all other flags and sets our flag, since that is the only bit set in the mask.

Parameters:
Name Type Description
flag
Source:
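
The three flag operations (clearFlag, getFlag and setFlag) can be sketched as pure functions over a flags value. The function names, the explicit `mask` and `index` parameters and the use of the zero flag (bit 6 of eflags) are assumptions made for this example; the methods above take only a flag argument and work on the registers object.

```javascript
// Hypothetical sketch of set, clear and get on a flags value.
const ZF_INDEX = 6;            // the zero flag sits at bit 6 of eflags
const ZF_MASK = 1 << ZF_INDEX; // only the zero flag bit is set

function setFlagSketch(eflags, mask) {
  return (eflags | mask) >>> 0;   // OR sets the bit and keeps all others
}

function clearFlagSketch(eflags, mask) {
  return (eflags & ~mask) >>> 0;  // AND with the inverted mask clears only that bit
}

function getFlagSketch(eflags, mask, index) {
  return (eflags & mask) >>> index; // isolate the bit, then shift it to bit 0
}

let eflags = 0b1; // carry flag already set
eflags = setFlagSketch(eflags, ZF_MASK);
console.log(eflags.toString(2), getFlagSketch(eflags, ZF_MASK, ZF_INDEX));
```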

signed(v, nbytes) → {Boolean}

Determines if a number is negative when interpreted as a signed value, i.e. its most significant bit is set.

bithacks

Parameters:
Name Type Description
v Number

to check for signedness

nbytes Number

size of the value in bytes

Source:
Returns:

true if number is signed, otherwise false

Type
Boolean
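
A minimal sketch of this check shifts the most significant bit of the given size down to bit 0 and tests it. `signedSketch` is a hypothetical name and assumes nbytes is 1, 2 or 4.

```javascript
// Hypothetical sketch: true if the top bit of an nbytes-sized value is set.
function signedSketch(v, nbytes) {
  return ((v >>> (nbytes * 8 - 1)) & 1) === 1;
}

console.log(signedSketch(0x80, 1), signedSketch(0x7f, 1));
```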
*generated with [docme](https://github.com/thlorenz/docme)*

License

GPL3