llvm-mos / llvm-mos

Port of LLVM to the MOS 6502 and related processors

[65816] 24-bit address space support #320

Open asiekierka opened 1 year ago

asiekierka commented 1 year ago

Part of https://github.com/llvm-mos/llvm-mos/issues/32 . This issue is a (simpler) specialization of the problem space presented in https://github.com/llvm-mos/llvm-mos/issues/319 , but it still requires a lot of consideration with regard to how to implement it.

Note that resolving this issue alone, without implementing 16-bit register modes, is enough to allow the development of 65816 targets; the only remaining distinction for 65816 "native mode" is adjusting llvm-mos-sdk's crt0 to initialize the stack pointer to $01FF rather than $FF on such targets.

asiekierka commented 11 months ago

The 65816 architecture distinguishes the following pointer types:

The LLVM data layout ought to, at minimum, distinguish the following address spaces:

There could be some merit in adding additional address spaces for "program bank" and "stack-relative" near pointers - gcc-ia16 has done this to great effect to catch some classes of bugs caused by implicitly casting a stack-relative pointer to a data bank pointer; this doesn't need to be done as part of this issue, though.

There could also be some merit in adding an equivalent to the 8086's code models, which would determine what address spaces various content is placed in implicitly: small (near .text, near .data), medium (far .text, near .data), compact (near .text, far .data), large (far .text, far .data).
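
To make the four code models concrete, here is a minimal C sketch of the mapping from model to default pointer "distance" for code and data. All names (`code_model`, `model_defaults`, `defaults_for`) are hypothetical and chosen for illustration only; they are not existing llvm-mos options or APIs.

```c
#include <stdbool.h>

/* Hypothetical names: none of these exist in llvm-mos today. */
enum code_model { MODEL_SMALL, MODEL_MEDIUM, MODEL_COMPACT, MODEL_LARGE };

struct model_defaults {
    bool far_text; /* true: code is placed in a far address space by default */
    bool far_data; /* true: data is placed in a far address space by default */
};

/* Map a code model to its implicit placement, following the 8086 naming:
   small = near/near, medium = far text, compact = far data, large = far/far. */
static struct model_defaults defaults_for(enum code_model m)
{
    struct model_defaults d = { false, false };
    switch (m) {
    case MODEL_SMALL:
        break;
    case MODEL_MEDIUM:
        d.far_text = true;
        break;
    case MODEL_COMPACT:
        d.far_data = true;
        break;
    case MODEL_LARGE:
        d.far_text = true;
        d.far_data = true;
        break;
    }
    return d;
}
```

For example, `defaults_for(MODEL_COMPACT)` yields near code and far data, which is the combination a program with a small code footprint but large data tables would want.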

Importantly, the 65816 need not worry about data accesses conflicting with code execution - the program bank and data bank are held in separate registers.

asiekierka commented 6 months ago

A forum post on 6502.org raises the point that 32-bit pointers are actually cheaper to handle than 24-bit pointers: 24-bit pointer arithmetic requires one 16-bit and one 8-bit access, and switching between the two register widths incurs a penalty. 32-bit pointers are also a bit nicer to model from a C perspective.

As such, it might be worth distinguishing 32-bit pointers in address space 0, and 24-bit ("packed") pointers in address space 3.
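
The difference between the two representations can be sketched in portable C. This is only a host-side model under stated assumptions: `packed_ptr`, `packed_add`, `packed_to_linear` and `wide_add` are hypothetical names, and a host compiler may pad the struct beyond three bytes, which a real 65816 backend would not do.

```c
#include <stdint.h>

/* Hypothetical 24-bit "packed" pointer: 16-bit offset plus 8-bit bank.
   A host compiler may pad this struct; it only models the two parts. */
struct packed_ptr {
    uint16_t offset;
    uint8_t bank;
};

/* Advance a packed pointer. The offset needs a 16-bit add and the bank an
   8-bit add of the carry, which on a 65816 means two register widths. */
static struct packed_ptr packed_add(struct packed_ptr p, uint16_t delta)
{
    uint32_t sum = (uint32_t)p.offset + delta;
    p.offset = (uint16_t)sum;                 /* low 16 bits */
    p.bank = (uint8_t)(p.bank + (sum >> 16)); /* carry into the bank byte */
    return p;
}

/* Flatten a packed pointer into a 24-bit linear address. */
static uint32_t packed_to_linear(struct packed_ptr p)
{
    return ((uint32_t)p.bank << 16) | p.offset;
}

/* 32-bit pointer: the same step is a single uniform-width addition
   (two 16-bit halves on the CPU), with the top byte left unused. */
static uint32_t wide_add(uint32_t p, uint16_t delta)
{
    return p + delta;
}
```

Both forms reach the same address when an increment crosses a bank boundary; the packed form simply has to handle the 16-bit and 8-bit parts separately.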

Also, the formal 65816 terminology is direct page address, absolute address and long address.
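
As a rough illustration of how the three address kinds differ, the following C sketch computes the effective 24-bit address for each. It assumes native mode and plain (non-indexed) data accesses; the `cpu_regs` struct and the function names are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Hypothetical register snapshot; field names are illustrative only. */
struct cpu_regs {
    uint16_t d;  /* direct page register */
    uint8_t dbr; /* data bank register */
};

/* Direct page address: 8-bit offset added to D, always in bank 0.
   The sum wraps within bank 0 (assumption: native mode, no indexing). */
static uint32_t direct_page_ea(const struct cpu_regs *r, uint8_t offset)
{
    return (uint16_t)(r->d + offset);
}

/* Absolute address: 16-bit offset within the current data bank. */
static uint32_t absolute_ea(const struct cpu_regs *r, uint16_t addr)
{
    return ((uint32_t)r->dbr << 16) | addr;
}

/* Long address: the full 24 bits are given explicitly. */
static uint32_t long_ea(uint32_t addr)
{
    return addr & 0xFFFFFFu;
}
```

With D = $2000 and the data bank set to $7E, a direct page offset of $10 resolves to $002010, the absolute address $1234 resolves to $7E1234, and a long address is used as given.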

mlund commented 6 months ago

> A forum post on 6502.org raises the point that 32-bit pointers are actually cheaper to handle than 24-bit pointers.

Not sure if this is relevant to the design here, but the MEGA65 uses 28-bit addresses. From the MEGA65 book:

[Image: Screenshot 2024-03-05 at 21 51 09]

asiekierka commented 5 months ago

Proposed LLVM address space layout for 65816 CPUs: