Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.96k stars 555 forks source link

Text::Balanced bugs + excessive regex recompilation, Filter::Simple bug #19469

Open mohawk2 opened 2 years ago

mohawk2 commented 2 years ago

Module: Text::Balanced, Filter::Simple

Description Text::Balanced has a couple of bugs, and also excessive regex recompilation (which happens whenever the Perl scalar in an RE changes, which is unfortunate if one has a for loop over regexes with /\G($regex)/g as Text::Balanced does). This was revealed by PDL's use of Filter::Simple, which also has a minor bug that is not worth PR-ing a fix for (being blead-first) until the Text::Balanced changes are merged and incorporated into blead (and released to CPAN also for both, please! So PDL can delete its excessive monkey-patching).

Steps to Reproduce For the bugs, see test patches in https://github.com/steve-m-hay/Text-Balanced/pull/6 (false-positive on heredoc with $x <<= 1, false-positive quotelike with {y=>1}...$x=2;$f=3; treated as y///).

The y_quotelike test also shows that Filter::Simple needs to change its $variable condition to be its own \&extract_variable in order to correctly process that snippet. Once T:B is updated in blead, I will (or someone else is welcome to) make a PR for that change too.

For the recompilation problem, loading PDL::LinearAlgebra with PDL versions 2.063_04-2.068 (monkey-patched in 2.069 with the code in https://github.com/steve-m-hay/Text-Balanced/pull/5) would take 10s due to recompiling Filter::Simple regexes 250k times, as revealed by the amazing Devel::NYTProf.

Expected behavior Correct recognition of Perl constructs without unneeded recompilation of regexes.

Perl configuration Text::Balanced and Filter::Simple haven't changed in ages, so 5.34 and 5.32 at least.

jkeenan commented 2 years ago

Text::Balanced is maintained upstream on CPAN, so its bugs need to be reported to RT.

Filter::Simple is in dist/, which means its maintained by P5P. So we can accept bug reports here.

jkeenan commented 2 years ago

Text::Balanced is maintained upstream on CPAN, so its bugs need to be reported to RT.

Filter::Simple is in dist/, which means its maintained by P5P. So we can accept bug reports here.

That being said, @mohawk2 writes, "Filter::Simple ... also has a minor bug that is not worth PR-ing a fix for (being blead-first) until the Text::Balanced changes are merged and incorporated into blead ...." But that means that this ticket contains no bug reports that are directly actionable by P5P.

@mohawk2, do you want to file a bug report about Filter-Simple? Otherwise, this ticket should be closed.

mohawk2 commented 2 years ago

Frankly, I opened this issue because for T:B, the RT with its 16-year old open tickets did not inspire confidence, and I hoped this would prompt some action. Filter::Simple has bugs (plural) in its code_no_comments handling that are caused by Text::Balanced, which I have submitted fixes for. The cases are shown in the PRs. Are you saying you think that makes this closable?

demerphq commented 2 years ago

On Mon, 28 Feb 2022 at 21:01, mohawk2 @.***> wrote:

Frankly, I opened this issue because for T:B, the RT with its 16-year old open tickets did not inspire confidence, and I hoped this would prompt some action. Filter::Simple has bugs (plural) in its code_no_comments handling that are caused by Text::Balanced, which I have submitted fixes for. The cases are shown in the PRs. Are you saying you think that makes this closable?

What he is saying is that while Text::Balanced is distributed with perl as convenience and because we use it as part of the build process it is not maintained by the porters. That is why it is contained in the cpan/ directory of the Perl project. If you feel the module has been abandoned then you can request to the @.*** list that we take over maintenance, which we may do, or find someone else to maintain it (would you like to volunteer?). But normally we do not make changes to cpan modules independently from cpan itself, we just copy the latest version posted into the perl distribution. I have cc'ed this mail to the list for their attention. If the ownership is to change we would have to go through a process with the owners of PAUSE and etc before it would happen. (I actually don't know the details of this, but it has happened before.)

cheers, yves

mohawk2 commented 2 years ago

I would be happy to help with Text::Balanced if @steve-m-hay is having a busy period (which we all have from time to time) and would like to grant me collaborator status on the repo. I see that first-come on it is held by DMANURA who seems to have only been active on PAUSE during 2005. I would be happy to have co-maint to release this, and would in any case suggest that the first-come on it be shifted to someone more suitable (quite possibly SHAY).

Once a CPAN release were out, I assume it would be a formality to update the Perl version, at which point I could then update Filter::Simple (and then hopefully that could then be released).

demerphq commented 2 years ago

On Tue, 1 Mar 2022, 12:19 mohawk2, @.***> wrote:

I would be happy to help with Text::Balanced if @steve-m-hay https://github.com/steve-m-hay is having a busy period (which we all have from time to time)

He has been busy releasing the latest maintenance build, perl 5.34.1, so he might not have even noticed this ticket.

and would like to grant me collaborator status on the repo. I see that

first-come on it is held by DMANURA who seems to have only been active on PAUSE during 2005. I would be happy to have co-maint to release this, and would in any case suggest that the first-come on it be shifted to someone more suitable (quite possibly SHAY).

I can't speak for Steve but if it were me I would welcome help.

Once a CPAN release were out, I assume it would be a formality to update

the Perl version,

It would get updated in the next minor version change I believe.

at which point I could then update Filter::Simple (and then hopefully that

could then be released).

Makes sense to me.

I cc'd Steve on this mail.

Cheers Yves

demerphq commented 2 years ago

On Tue, 1 Mar 2022 at 04:41, demerphq @.***> wrote:

On Tue, 1 Mar 2022, 12:19 mohawk2, @.***> wrote:

I would be happy to help with Text::Balanced if @steve-m-hay is having a busy period (which we all have from time to time)

He has been busy releasing the latest maintenance build, perl 5.34.1, so he might not have even noticed this ticket.

I had noticed, but hadn't yet had time to respond. Apologies for this.

and would like to grant me collaborator status on the repo. I see that first-come on it is held by DMANURA who seems to have only been active on PAUSE during 2005. I would be happy to have co-maint to release this, and would in any case suggest that the first-come on it be shifted to someone more suitable (quite possibly SHAY).

I can't speak for Steve but if it were me I would welcome help.

I'd be happy to grant co-maint (assuming I'm able to do that). I only took on maintenance of this module in 2015 in order to make a release to sync with bleadperl since it had fallen out of maintenance. I've since done minimal work on it, but remain in a position to take essential bug fixes and make new releases, mainly with a view to getting important fixes into the core perl distro.

If you are someone with a more active interest in it and able to respond more quickly to such issues then you're very welcome to help out. What is your CPAN ID and I'll see if I can grant co-maint.

mohawk2 commented 2 years ago

I am ETJ, but DMANURA has first-come as of this writing (see https://metacpan.org/dist/Text-Balanced/permissions) so this will require the PAUSE gods' intervention (since I consider it unlikely DMANURA even still has their credentials from 2005) - I would strongly suggest first-come be shifted to SHAY. @neilb ? :-}

@demerphq Thank you for interlocuting here - if @steve-m-hay can add me as collaborator on the repo then we can get this thing moving!

steve-m-hay commented 2 years ago

On Tue, 1 Mar 2022 at 22:46, mohawk2 @.***> wrote:

I am ETJ, but DMANURA has first-come as of this writing (see https://metacpan.org/dist/Text-Balanced/permissions) so this will require the PAUSE gods' intervention (since I consider it unlikely DMANURA even still has their credentials from 2005) - I would strongly suggest first-come be shifted to SHAY. @neilb ? :-}

@demerphq Thank you for interlocuting here - if @steve-m-hay can add me as collaborator on the repo then we can get this thing moving!

@mohawk2 - You should have just received an invite as a collaborator on the repo.

demerphq commented 2 years ago

On Tue, 1 Mar 2022 at 23:46, mohawk2 @.***> wrote:

I am ETJ, but DMANURA has first-come as of this writing (see https://metacpan.org/dist/Text-Balanced/permissions) so this will require the PAUSE gods' intervention (since I consider it unlikely DMANURA even still has their credentials from 2005) - I would strongly suggest first-come be shifted to SHAY. @neilb https://github.com/neilb ? :-}

@demerphq https://github.com/demerphq Thank you for interlocuting here

No thanks required, I just explained the process and CC'ed Steve. I am glad you and he are working things out, sounds like a success story to me! At least in the long run.

cheers, Yves

mohawk2 commented 2 years ago

You should have just received an invite as a collaborator on the repo.

I have just accepted it - thanks! Now for the minor matter of PAUSE permissions...

demerphq commented 2 years ago

On Wed, 2 Mar 2022 at 03:06, mohawk2 @.***> wrote:

You should have just received an invite as a collaborator on the repo.

I have just accepted it - thanks! Now for the minor matter of PAUSE permissions...

I think that needs to be done by Andreas, who I have included in this email.

cheers, Yves

mohawk2 commented 2 years ago

I've just uploaded Text::Balanced 2.04_01. Naturally, we'll need to know it makes things better without breaking anything. Is there some procedure where I could PR updating Perl's T:B plus an update to Filter::Simple, and it would get smoke-tested? I've tried my updated version locally on various things and it seems Ok so far.

mohawk2 commented 2 years ago

(The dev-release includes fixes for just about all the open RT tickets)

demerphq commented 2 years ago

On Mon, 7 Mar 2022 at 07:31, mohawk2 @.***> wrote:

I've just uploaded Text::Balanced 2.04_01. Naturally, we'll need to know it makes things better without breaking anything. Is there some procedure where I could PR updating Perl's T:B plus an update to Filter::Simple, and it would get smoke-tested? I've tried my updated version locally on various things and it seems Ok so far.

Not 100% of the procedure, or if you need someone to do it for you, or if you can do it yourself. But we have smoke-me branches in blead and a tool in Porting to update blead with the latest modules. You could try doing that and pushing a PR, if the smoke-me branch needs to be in the core repo I or one of the other committers can arrange that for you.

cheers, yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

jkeenan commented 2 years ago

Today, in order to move the discussion in this ticket forward, I created the following branch in our repository:

smoke-me/jkeenan/text-balanced-gh-19469

For this branch I took the HEAD of the master branch of Text-Balanced's GH repository (version "2.04_02"), created a tarball from it, then synched that tarball into my branch.

perl Porting/sync-with-cpan --tarball $GIT_WORKDIR/Text-Balanced-1/Text-Balanced-2.04_02.tar.gz Text::Balanced

I then used the MetaCPAN API to get a list of Text-Balanced's reverse dependencies on CPAN:

curl -XPOST https://fastapi.metacpan.org/v1/release/_search -d '{
  "size": 5000,
  "fields": [ "distribution" ],
  "filter": {
    "and": [
      { "term": { "dependency.module": "Text::Balanced" } },
      { "term": {"maturity": "released"} },
      { "term": {"status": "latest"} }
    ]
  }
}' 

I massaged the results until I got a list of modules in Some::Module format that I could feed into cpanm installed against a perl built from the smoke-me/jkeenan/text-balanced-gh-19469 branch.

Several hours later ... I had installed approximately 800 CPAN distributions. I examined the .cpanm/latest-build/build.log for failures. I could not attribute any failures to this new version of Text-Balanced.

Which leads to the following questions:

@rjbs, if a new version (2.05) of Text-Balanced were to be released to CPAN in, say, the next two weeks, and do well on CPANtesters, would it be eligible for synching into blead before the 5.35.11 release anticipated for April 20?

@mohawk2, @steve-m-hay, if the answer to the previous question is Yes, do you believe that Text-Balanced is in good enough shape for a new CPAN release and can you make that happen?

mohawk2 commented 2 years ago

I believe it is in good enough shape, though I do not have permissions to CPAN-release it.

steve-m-hay commented 2 years ago

CPAN tests seem to be going well, so I agree it's in good shape to release, which I will try to find the time to do soon. It seems a bit late to go into blead given its code freeze state already though, but we can certainly have a new release on CPAN in time for the 5.36.0 release, so anyone updating their perls can grab a new version of Text::Balanced while they're at it.

mohawk2 commented 2 years ago

I'd agree it shouldn't go in blead due to code-freeze. My problem (as PDL maintainer) will very much be solved with a CPAN release, since I can depend on it and delete the literally 500 lines of monkey-patch in there now.

PDL has further monkey-patches to Filter::Simple which I'd like to fix properly, but I am proposing to deal with that as a separate stage (and in fact it will be very small by comparison since the hard work is all in Text::Balanced). With a view to concurrently progressing that, @tsee are you provisionally happy to consider a similar update to Filter::Simple?

mohawk2 commented 2 years ago

Hopefully several weeks later isn't too soon to ask: any progress? If I could get the co-maint status, I could not only release T:B 2.05, but also close the RT fixed thereby.

That would then unlock updating blead's T:B, and me fixing Filter::Simple.

steve-m-hay commented 2 years ago

Hopefully several weeks later isn't too soon to ask: any progress? If I could get the co-maint status, I could not only release T:B 2.05, but also close the RT fixed thereby.

That would then unlock updating blead's T:B, and me fixing Filter::Simple.

I will do this soon, but since it wasn't required for 5.36.0 I didn't see a great rush for it.

neilb commented 2 years ago

I've pinged DMANURA, and have suggested that we transfer the first-come to P5P, so granting of co-maints can then be a PSC issue.

Hopefully if he responds promptly, I'll make the permissions changes. If I don't hear back from him, I'll at least grant ETJ co-maint.

steve-m-hay commented 2 years ago

@mohawk2 I've just released Text-Balanced-2.05 to CPAN. Thanks for all your work on this. Apologies for taking longer than intended to make this release.

mohawk2 commented 2 years ago

@steve-m-hay Amazing, thank you! Now I can update PDL to use the updated version, and start on the Filter::Simple work (the second half of the reported problem) - I believe that's blead-first, but would welcome corrections if I'm wrong. I see you've also closed a number of the RT bugs, thank you.

mohawk2 commented 2 years ago

Note to self: Filter::Simple's $EOP needs the \n# case for the case of:

=cut
#line blah
mohawk2 commented 2 years ago

This will need fixing (a breakage for Parse::RecDescent: https://rt.cpan.org/Ticket/Display.html?id=142922) before I look at Filter::Simple.

mohawk2 commented 2 years ago

The above is now fixed (including adding CI to test at least those downstream modules identified by those issues), and a dev-release (2.05_01) has been uploaded to CPAN.

@steve-m-hay In due course (and assuming CPAN Testers likes it), could you make a new full release on CPAN? Meanwhile, I believe nothing is in the way of me having a go at Filter::Simple.

steve-m-hay commented 2 years ago

The above is now fixed (including adding CI to test at least those downstream modules identified by those issues), and a dev-release (2.05_01) has been uploaded to CPAN.

@steve-m-hay In due course (and assuming CPAN Testers likes it), could you make a new full release on CPAN? Meanwhile, I believe nothing is in the way of me having a go at Filter::Simple.

Tests are looking good so far. I will make a new stable release soon.

mohawk2 commented 2 years ago

Once the new stable release is out, I will update the Debian package accordingly.

steve-m-hay commented 2 years ago

Once the new stable release is out, I will update the Debian package accordingly.

I have just pushed Text-Balanced-2.06 to CPAN :-)

mohawk2 commented 2 years ago

The Debian packaging is progressing nicely (with lots and lots of support from the experienced Debian folks!).