andgineer / TRegExpr

Regular expressions (regex), pascal.
https://regex.sorokin.engineer/en/latest/
MIT License
174 stars 63 forks source link

Segfault on FPC when FPC_REQUIRES_PROPER_ALIGNMENT is enabled #251

Open foxpas opened 3 years ago

foxpas commented 3 years ago

Thanks for the great library.

I just wanted to point out an issue on platforms for which FPC defines FPC_REQUIRES_PROPER_ALIGNMENT. In my case I am using TRegExpr on macOS with FPC.

When compiling for macOS x86_64 FPC_REQUIRES_PROPER_ALIGNMENT is not defined and TRegExpr works fine.

When compiling for macOS aarch64 FPC_REQUIRES_PROPER_ALIGNMENT is defined, which makes TRegExpr use pointer alignment. The following code causes a segfault:

X := TRegExpr.Create('^([0-9A-F]{2}[:-]?){5}([0-9A-F]{2})$');
X.Exec('aabbccddeeff')

I can work around this by undefining FPC_REQUIRES_PROPER_ALIGNMENT, so TRegExpr will not align pointers for aarch64. It works, but to the best of my knowledge when FPC_REQUIRES_PROPER_ALIGNMENT is defined, pointers should be aligned.

Therefore, there seems to be a bug in TRegExpr on platforms that require aligned pointers.

Alexey-T commented 3 years ago

Regexpr.pas has several usages of that define, e.g.

  {$IFDEF FPC_REQUIRES_PROPER_ALIGNMENT}
  // add space for aligning pointer
  // -1 is the correct max size but also needed for InsertOperator that needs a multiple of pointer size
  RENextOffSz = (2 * SizeOf(TRENextOff) div SizeOf(REChar)) - 1;
  REBracesArgSz = (2 * SizeOf(TREBracesArg) div SizeOf(REChar));
  // add space for aligning pointer

function AlignToPtr(const p: Pointer): Pointer; {$IFDEF InlineFuncs}inline;{$ENDIF}
begin
  {$IFDEF FPC_REQUIRES_PROPER_ALIGNMENT}
  Result := Align(p, SizeOf(Pointer));
  {$ELSE}
  Result := p;
  {$ENDIF}
end;

function AlignToInt(const p: Pointer): Pointer; {$IFDEF InlineFuncs}inline;{$ENDIF}
begin
  {$IFDEF FPC_REQUIRES_PROPER_ALIGNMENT}
  Result := Align(p, SizeOf(integer));
  {$ELSE}
  Result := p;
  {$ENDIF}
end;

please help to correct the code inside, as I don't know how to do that. Maybe @user4martin, @pascaldragon can help.

Alexey-T commented 3 years ago

I don’t have aarch64 hardware so cannot help.

User4martin commented 3 years ago

I don't know exactly what the code is trying to do....

I only had a very quick look at it / not much time...

https://github.com/andgineer/TRegExpr/blob/c022defcfa9d431818eff920d8159edde15ca7b8/src/regexpr.pas#L899

https://github.com/andgineer/TRegExpr/blob/c022defcfa9d431818eff920d8159edde15ca7b8/src/regexpr.pas#L2263-L2264

So the 2nd bit of code does the following.

So then

Now, for that bit of code, I would guess (and really guess, as I have not looked beyond those few lines), that it should be function AlignToPtr(var p: Pointer): Pointer

and should modify p, to the same as the result.

There may be more issues like the one above....

User4martin commented 3 years ago

If you only do aligns as the above -> that is align individual values in an allocated memory block/stream -> try forcing FPC_REQUIRES_PROPER_ALIGNMENT on for your test build. Even if not required, you should be allowed to align values.

Howewer if you do stuff like

type
  TRec = record 
    b: byte; 
    p: Pointer; 
  end;
  PRec = ^TRec;
var 
  t: TRec;

alignPtr(Pointer(@t) + sizeOf(byte))! := nil;

So you try to match the mem layout of a compiled in record, then you can not force it on. Because the record will still be unaligned on your test system.

Instead do

Pointer(@t) + PtrUInt(@Prec(nil)^.p)
Alexey-T commented 3 years ago

I guess you are right, but I don't have the aarch64 hardware to test this.

foxpas commented 3 years ago

I guess you are right, but I don't have the aarch64 hardware to test this.

I have a VM running macOS 11 on aarch64, with FPC/Lazarus installed. I can give you SSH and RDP access to that machine. Would you be willing to take a look at this issue?

User4martin commented 3 years ago

@Alexey-T I believe some fixes can be done on intel/amd.

Just compile the code with -dFPC_REQUIRES_PROPER_ALIGNMENT (force it on). And then fix the issues on intel/amd.

(Search does not show any "matching inside a record" aligns, so that should work)


At least that will catch the issue I described. There may be missing aligns? Those need correct hardware to detect.

Alexey-T commented 3 years ago

Now I have the aarch64 hardware. Raspberry Pi 3 with 64-bit Ubuntu Server ARM64. I see that test_fpc project fails on 4-5 tests (if running as is).

Ok, but what is this?

I can work around this by undefining FPC_REQUIRES_PROPER_ALIGNMENT, so TRegExpr will not align pointers for aarch64. It works,

I added {$undef FPC_REQUIRES_PROPER_ALIGNMENT} to the beginning of Regexpr module. Yes I confirm on aarch64 that it works (ie test_fpc project runs all tests OK)! So we don't really need that handling of FPC_REQUIRES_PROPER_ALIGNMENT ? seems so. I don't understand why it works? :)

@user4martin Do you know why it works?

Alexey-T commented 3 years ago

I posted the pull req which undefines that word.

foxpas commented 3 years ago

AArch64 permits unaligned access to regular memory, so technically FPC_REQUIRES_PROPER_ALIGNMENT does not need to be defined at all. This is documented on ARM's website and mentioned in this discussion.

So locally undefining FPC_REQUIRES_PROPER_ALIGNMENT for RegExpr will make it work on AArch64. Having said that, on any platform that does not permit unaligned access RegExpr may not work regardless of this change. That would be some minor/unpopular CPUs.

Alexey-T commented 3 years ago

Let’s keep this opened until I find the hardware (which requires aligned access)...