possible solution for CPUs which require alignment

Alexey-T commented 1 year ago

@user4martin To solve the AlignToPtr and all other Align* issues, we can change how the opcode is stored: make it an array of PtrInt. Even the small OpCode field (Word for UnicodeRE, Byte otherwise) would become a PtrInt, and so on. It will need more memory for the opcode array, but it is much simpler! Agree?

User4martin commented 1 year ago

Doesn't OP_EXACT store chars (potentially longer substrings)? Those then need to be copied into the memory, without extending each to longint size.

Size can matter. If the compiled regex fits into a single cache line, that can affect the execution speed.
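To put a rough number on the size concern, here is a minimal sketch comparing the two storage layouts for the literal carried by an OP_EXACT-style node. It assumes the Unicode build, where REChar is a 2-byte WideChar. The function names `PackedBytes` and `WidenedBytes` are hypothetical and not part of TRegExpr.

```pascal
program OpcodeStorageSize;
{$mode objfpc}

type
  REChar = WideChar; // assumption: Unicode build, 2 bytes per char

// Bytes needed when each char of the literal is stored as a REChar.
function PackedBytes(LiteralLen: SizeInt): SizeInt;
begin
  Result := LiteralLen * SizeOf(REChar);
end;

// Bytes needed when each char is widened to one PtrInt slot.
function WidenedBytes(LiteralLen: SizeInt): SizeInt;
begin
  Result := LiteralLen * SizeOf(PtrInt);
end;

begin
  // A 20-character literal, chosen only for illustration.
  WriteLn('packed:  ', PackedBytes(20), ' bytes');
  WriteLn('widened: ', WidenedBytes(20), ' bytes');
end.
```

On a 64-bit target, where SizeOf(PtrInt) is 8, a 20-character literal takes 40 bytes packed and 160 bytes widened, so the widened form no longer fits in a typical 64-byte cache line.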

Opcodes themselves may even become an enum (which may help the compiler optimize the "case" statement), though they would still be stored as char/byte/int ... whatever. So they would be typecast.
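A minimal sketch of that idea follows: the opcode lives in the code array as a plain Byte and is typecast to an enum only at the point of dispatch. The enum `TOpKind` and its members are hypothetical names for illustration, not the actual TRegExpr opcodes.

```pascal
program OpcodeEnumSketch;
{$mode objfpc}

type
  // Hypothetical subset of opcodes; the names are illustrative only.
  TOpKind = (okEnd, okExact, okAny);

function Describe(Stored: Byte): string;
begin
  // The stored byte is typecast back to the enum before dispatching.
  case TOpKind(Stored) of
    okEnd:   Result := 'end';
    okExact: Result := 'exact';
    okAny:   Result := 'any';
  else
    Result := 'unknown';
  end;
end;

var
  Code: array[0..2] of Byte;
begin
  // The storage stays one byte per opcode.
  Code[0] := Ord(okExact);
  Code[1] := Ord(okAny);
  Code[2] := Ord(okEnd);
  WriteLn(Describe(Code[0]), ' ', Describe(Code[1]), ' ', Describe(Code[2]));
end.
```

This prints `exact any end`. Whether the enum actually yields a better jump table than integer constants depends on the compiler and would need to be measured.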

Most opcodes do not need a "next" pointer. However, it is dictated by the way the parser currently works. That is something that is currently at the back of my mind for "someday, when I have time" (which may translate to never, or the far future...)

User4martin commented 1 year ago

Fixing it based on the current concept requires a decision first (so it is probably already made).

The current code has fragments reminiscent of 2 different approaches.

1)

    offset := PRENextOff(AlignToPtr(p + REOpSz))^;
    Result := p + offset;

The current pointer is increased to an aligned position.

That is probably what most people would expect, and it is what makes sense in the given context.
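For reference, "increasing to an aligned position" can be sketched as a round-up to the next multiple of the alignment. `AlignUp` below is a hypothetical stand-in, not the real AlignToPtr, and it assumes the alignment is a power of two.

```pascal
program AlignUpSketch;
{$mode objfpc}

// Hypothetical stand-in for AlignToPtr: rounds an address up to the next
// multiple of Align. Assumes Align is a power of two.
function AlignUp(Addr, Align: PtrUInt): PtrUInt;
begin
  Result := (Addr + Align - 1) and not (Align - 1);
end;

begin
  WriteLn(AlignUp(17, 4)); // 20: moved up to the next multiple of 4
  WriteLn(AlignUp(20, 4)); // 20: already aligned, unchanged
end.
```

With this approach every read and every write of the offset has to go through the same rounding step, so the reader and the writer always agree on where the value lives.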

2)

    RENextOffSz = (2 * SizeOf(TRENextOff) div SizeOf(REChar)) - 1;
    Inc(regCode, RENextOffSz);
    PRENextOff(AlignToPtr(scan + REOpSz))^ := -(scan - val)

This is usually used if you have only one instance and a fixed allocation of memory (e.g. on the stack, or in the middle of a fixed struct).

You allocate TWICE the space (where space >= align).
Your aligned data goes at the first aligned position inside that space.

E.g. your size and align are both 4. You allocate 8. Now if your pointer is at 17, then you have space from 17 to 24. The next aligned position is 20, and your data goes from 20 to 23. => all fine.
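The numbers in that example can be checked with a small sketch. Addresses are modelled as plain integers here, and all names are hypothetical.

```pascal
program DoubleReserveSketch;
{$mode objfpc}

const
  DataSize = 4;            // size and alignment of the value, as in the example
  Reserved = 2 * DataSize; // twice the space is reserved

var
  P, LastByte, DataStart, DataEnd: Integer;
begin
  P := 17;
  LastByte := P + Reserved - 1;                            // 24
  // First multiple of DataSize at or after P.
  DataStart := (P + DataSize - 1) div DataSize * DataSize; // 20
  DataEnd := DataStart + DataSize - 1;                     // 23
  WriteLn('reserved ', P, '..', LastByte);
  WriteLn('data     ', DataStart, '..', DataEnd);
  WriteLn('fits: ', (DataStart >= P) and (DataEnd <= LastByte));
  // Skipping over the slot is a plain increment, with no alignment call.
  Inc(P, Reserved);
  WriteLn('next ', P);                                     // 25
end.
```

The aligned value always fits because at most DataSize - 1 bytes of padding are needed in front of it, and the reservation leaves DataSize spare bytes.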

But that is way less efficient, except if in most cases you do not access the data but only increase the counter, because increasing the counter then does not need the call to "align".

I have not checked, but I think we do in most cases access the pointer (except in the first pass). So we could speed up the first pass of compiling, but would then over-allocate memory.

Currently the code has a mixture of both of the above, and some data is written without any alignment....

Alexey-T commented 1 year ago

Yes, you are right. We would over-allocate memory, which is not good. So declined.

andgineer / TRegExpr

possible solution for CPUs which require alignment #331