Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.95k stars 554 forks source link

remove pointless SVPV mortal alloc in S_strip_spaces()/prototype parser #22655

Open bulk88 opened 1 week ago

bulk88 commented 1 week ago

See commit message. Since I minimally know about the p5 prototype syntax grammar, is there an actual max_len to that string, and it is STILL parseable/not fatal later on? Other words, can string limit be FIXED, and that temp SVPV statement removed forever, adding a hard error, and using tiny stack buf only?

10MB, 100MB, and 1GB long prototype strings are welcome. Discussing prototype max_len verges on a Perl GOLF or anti-GOLF or fuzzing/sec exploit category.

iabyn commented 1 week ago

I don't really understand the motivation behind this PR. It seems to add a lot of complexity: not only in the static function, but in all its callers. And just to save creating a mortal SV string a couple of times per sub-with-prototype compilation? Am I missing something obvious?

bulk88 commented 4 days ago

I don't really understand the motivation behind this PR. It seems to add a lot of complexity: not only in the static function, but in all its callers. And just to save creating a mortal SV string a couple of times per sub-with-prototype compilation? Am I missing something obvious?

The mortal SV and malloc are being created and freed, even if the string contents are "clean" and the string contents will not change. That is a bug.

Perl prototypes are never more than ~10 chars long I think. It is senseless to, for EACH prototype parsed, to use malloc buffer+SV Body+SV head+free all 3 allocs, to parse the prototype, that 99% of the time, the string contents, will not change. A reasonably tiny, 16/32/64/128 C stack char array buffer is much better and faster. I left in the SVPV alloc code since the old code did it, but IMHO the SVPV alloc code should be removed and replaced by a bounds check and throw fatal parsing error.

If there are no comments, I will slim down the patch, and just use a C auto 32 byte char array or fallback to SVPV mortal, everywhere. Therefore less new C code by ASCII text .c file char length.

Ideally the SVPV mortal fallback code should be removed, but IDK what is the longest perl prototype that is not a grammar syntax error, and is not a security exploit/buffer overflow?

mauke commented 4 days ago

what is the longest perl prototype that is not a grammar syntax error, and is not a security exploit/buffer overflow?

Probably Web::Simple match specification or similar: https://metacpan.org/pod/Web::Simple#Web%3A%3ASimple-match-specifications

Those can be given as subroutine prototypes: https://metacpan.org/pod/Web::Simple#:~:text=An%20alternative%20to%20using%20string,the%20sub%20prototype

bulk88 commented 4 days ago

Those can be given as subroutine prototypes: https://metacpan.org/pod/Web::Simple#:~:text=An%20alternative%20to%20using%20string,the%20sub%20prototype

Examples in the POD seem to max out at 32 chars, 80 chars is line limit in Terminal. ~120 chars is line limit in a desktop URL box before being cut off.

so 32/64 chars is average? extreme but some production code does it is 128? Perl syntax/grammer valid is unlimited length until OOM from malloc?

iabyn commented 3 days ago

On Thu, Oct 17, 2024 at 09:40:21AM -0700, bulk88 wrote:

I don't really understand the motivation behind this PR. It seems to add a lot of complexity: not only in the static function, but in all its callers. And just to save creating a mortal SV string a couple of times per sub-with-prototype compilation? Am I missing something obvious?

The mortal SV and malloc are being created and freed, even if the string contents are "clean" and the string contents will not change. That is a bug.

I don't see how that's a bug. I looks to me like just a minor inefficiency.

My main point remains. Unless I am misunderstanding, these temporary buffers are only created one or twice during the compilation of a sub, and for that matter, only for a sub which has a prototype specified (which is very much a minority of subs).

So I see very little that is gained by optimising this function. Conversely, I see a lot of downside, especially as it makes the callers more complex, not just the function itself.

If you're really bothered by it, why not just make S_strip_spaces() do a quick scan for spaces and return the original prototype string if none found. That leaves the temp malloc just for the edge cases where the prototype hasn't been declared in perl, but presumably via some sort of XS route.

-- "Strange women lying in ponds distributing swords is no basis for a system of government. Supreme executive power derives from a mandate from the masses, not from some farcical aquatic ceremony." -- Dennis, "Monty Python and the Holy Grail"

bulk88 commented 3 days ago

New patch, alot less lines of text. More basic. Still gets all goals accomplished.

iabyn commented 1 day ago

On Fri, Oct 18, 2024 at 07:46:26AM -0700, bulk88 wrote:

New patch, alot less lines of text. More basic. Still gets all goals accomplished.

And it still doesn't address any of the points I raised. The main ones being that:

-- That he said that that that that is is is debatable, is debatable.