SamCoVT / TaliForth2

A Subroutine Threaded Code (STC) ANSI-like Forth for the 65c02

C65 enhancements #89

Closed patricksurry closed 2 months ago

patricksurry commented 2 months ago

This is overkill but I went down an enjoyable parsing rabbithole...

Most addresses and values can now be C-style expressions (arithmetic, logical ops, comparison, ternary `?:`). Labels can be used as symbols, as can the CPU registers (a, x, y, sp, pc) and flags (c, z, ...). There are a couple of special operators: unary `<` and `>` for extracting low byte/high byte like most assemblers, and `@` and `*` for dereferencing memory as a word or byte. For example `*label` means the byte stored at label, and `>(@(*(pc+1))+x)` means the high byte of the address formed by adding X to the address stored at the zp location stored at pc+1.
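To make the operator semantics concrete, here is a rough C sketch of what the four special operators compute. The `mem` array and the helper names are made up for illustration and are not taken from c65's source; little-endian words are assumed, as on the 65c02.

```c
#include <stdint.h>

/* Hypothetical 64K memory image; c65's real data structures may differ. */
static uint8_t mem[0x10000];

/* unary < : low byte of a 16-bit value */
static uint8_t lo_byte(uint16_t v) { return (uint8_t)(v & 0xff); }

/* unary > : high byte of a 16-bit value */
static uint8_t hi_byte(uint16_t v) { return (uint8_t)(v >> 8); }

/* unary * : the byte stored at addr */
static uint8_t peek_byte(uint16_t addr) { return mem[addr]; }

/* unary @ : the little-endian word stored at addr
   (assumption: the second byte wraps around at $ffff) */
static uint16_t peek_word(uint16_t addr) {
    return (uint16_t)(mem[addr] | (mem[(uint16_t)(addr + 1)] << 8));
}

/* What the expression >(@(*(pc+1))+x) works out to, step by step */
static uint8_t example_expr(uint16_t pc, uint8_t x) {
    uint8_t zp = peek_byte((uint16_t)(pc + 1)); /* *(pc+1): the zp operand   */
    uint16_t base = peek_word(zp);              /* @zp: address stored there */
    return hi_byte((uint16_t)(base + x));       /* >(base + x): high byte    */
}
```

With `$20` stored at pc+1, the word `$12f0` stored at `$20`, and x = `$20`, the expression evaluates to `$13` (the high byte of `$1310`).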

The ~ command becomes a pretty useful calculator for checking expressions and walking through memory.

Ranges are now written as `start.end` (wozmon style) or `start .. offset` since the old `:` and `/` are used by expressions. Spaces are less important, e.g. you can write `dis pc+4 .. x << 1` to disassemble 2*x bytes starting at pc+4.

also adds heatmap and inspect, plus trigger irq|nmi|reset.

patricksurry commented 2 months ago

@SamCoVT if you get a chance take a look at the heatmap command and subsequent disasm. super useful for exploring code usage and critical sections etc.

I updated the README to include the new stuff

SamCoVT commented 2 months ago

I get a few warnings when compiling c65 on Linux. Most are just differing integer types (eg. uint64_t is a "long unsigned int" while printf is using %llx for a "long long unsigned int", which is apparently not the same type on my system). This is always a pain in C, where different compilers have different notions about the size of types (stdint.h was supposed to fix this, but I'm not aware of any updates to the format specifiers for printf to make it aware of these types).

The only one that looks interesting is this one:

monitor.c:144:25: warning: suggest parentheses around comparison in operand of ‘&’ [-Wparentheses]
  144 |     for(addr=start; addr<start + (1024 << zoom) & addr < 0x10000; addr++) {
      |                     ~~~~^~~~~~~~~~~~~~~~~~~~~~~

Did you mean && there instead of &?
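For reference, a small sketch of how that condition parses. In C the relational operators bind tighter than `&`, so the two comparisons are grouped first and the single `&` combines their 0/1 results. That should give the same truth value as `&&` here, just without short-circuiting and without stating the intent. The function names and the `uint32_t` types are assumptions for the example, not copied from monitor.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* The condition as written: '<' binds tighter than '&', so it groups as
   (addr < start + (1024 << zoom)) & (addr < 0x10000). */
static bool in_range_bitand(uint32_t addr, uint32_t start, int zoom) {
    return (addr < start + (1024u << zoom)) & (addr < 0x10000u);
}

/* The intended logical AND: same truth value, plus short-circuiting. */
static bool in_range_logand(uint32_t addr, uint32_t start, int zoom) {
    return addr < start + (1024u << zoom) && addr < 0x10000u;
}

/* The trap -Wparentheses is really guarding against: `v & 1 == 0`
   groups as v & (1 == 0), which is always 0. */
static unsigned low_bit_test_as_parsed(unsigned v) { return v & (1 == 0); }

/* What was almost certainly meant in that case. */
static bool low_bit_clear(unsigned v) { return (v & 1) == 0; }
```

Writing `&&` (or adding the parentheses) silences the warning either way.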

I do like the heatmap command. After running words a few times, you can clearly see where the dictionary headers are in "ROM". I'll definitely play with that when I have more time.

The legend says 40 bytes/char... should that be $40 bytes/char?

patricksurry commented 2 months ago

weird, I don't get that paren warning with -Wall.  ~I think single & is correct there but will check on it and add a comment~ [yes, that's a bug, still can't get my gcc to tell me]

I debated $40 v 40 but left it off since everything is in hex by default.  it does look weird tho, maybe it's worth adding $ throughout the legend [added back]

wasn't aware of that printf issue - it looks like there might be some inttypes macros to embed the correct format (https://stackoverflow.com/questions/9225567/how-to-portably-print-a-int64-t-type-in-c)

SamCoVT commented 2 months ago

Looks like you already found it while I was typing:

It seems there is a portable way to print stdint.h types. inttypes.h has macros that expand to the correct printf format specifier for the compiler/target being used. They start with PRI, then d (or i), u, o, x, or X, then the width in bits, eg.

printf("Here is a number in lowercase hex: %" PRIx64 "\n", thevalue);

This relies on the fact that two string literals with no operator in between are concatenated in C. Note that it's safest to leave a space after the " and before the PRI: plain C accepts it either way, but C++11 and later treat letters right after the closing quote as a literal suffix.
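A minimal self-contained version of the same idea, in case it helps when fixing the warnings. `format_hex64` is a made-up helper name, not something in c65; only `PRIx64` and `snprintf` come from the standard headers.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write `value` as lowercase hex into `out`.
   PRIx64 expands to the right length modifier for uint64_t on the
   current target (e.g. "lx" or "llx"), so no cast is needed.
   Returns snprintf's result: the number of characters the full
   output needs, not counting the terminating NUL. */
static int format_hex64(char *out, size_t size, uint64_t value) {
    return snprintf(out, size, "%" PRIx64, value);
}
```

The other macros follow the same pattern, e.g. `PRIu64` for unsigned decimal or `PRIx16` for a 16-bit hex value.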

SamCoVT commented 2 months ago

The legend is much clearer now - I didn't realize the counts were in hex as well, and now the intervals make much more sense (essentially log scale). I also just noticed the heatmap is extended into the disassembly - that is super slick and I can see where that would help with optimizations. You can also see how tmp1, tmp2, and tmptos are used a ton in zero page, in order of usage, but tmp3 not so much.
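As an illustration of why hex counts and log-scale intervals fit together, here is one way such a bucketing could work: one bucket per hex digit of the count. This is only a guess at the general idea; `heat_bucket` and its thresholds are hypothetical and not c65's actual legend.

```c
#include <stdint.h>

/* Hypothetical log-scale bucketing: bucket 0 means "never touched",
   otherwise the bucket is the number of hex digits in the count,
   so $1..$f -> 1, $10..$ff -> 2, $100..$fff -> 3, and so on. */
static int heat_bucket(uint64_t count) {
    int bucket = 0;
    while (count != 0) {
        bucket++;
        count >>= 4; /* drop one hex digit */
    }
    return bucket;
}
```

Each bucket then covers a 16x wider range of counts than the one before it, which is what makes the intervals look natural when the legend is printed in hex.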

patricksurry commented 2 months ago

yes, i like how you can deduce which conditional branches tend to be taken or not, and see where the critical loops are.

not that it matters for this codebase where we have labels, but it's also cool that you can distinguish opcodes v operands v data by comparing heat x with heat r and heat w after you've let it run for a bit.