mlawren / githook-perltidy

Run perltidy as a Git pre-commit hook
41 stars 10 forks source link

data paragraphs in PODs are mangled #1

Closed finn closed 12 years ago

finn commented 12 years ago

Occasionally data paragraphs in PODs are mangled during a tidying run. This is due to how the POD::Tidy module parses and cleans PODs in the codebase, but since there is no way to change or alter this behavior from githook-perltidy, I thought it was worth filing a bug on this project as well.

Long story short: Pod::Tidy should be ignoring everything in between =begin and =end, (data paragraphs) and leaving them as-is instead of wrapping them, but it isn't.

AFAICT it isn't even looking at or detecting the existence of of those format blocks. I went through the code of Pod::Tidy (which subclasses Pod::Wrap which subclasses Pod::Parser) and in all three of those modules I couldn't find any special cases for data paragraphs. It seems like it should be the responsibility of Pod::Wrap and Pod::Tidy to detect that a paragraph of text is a data paragraph and not perform any tidying on it. They are not examining the parse tree or anything else beyond whether a paragraph is a verbatim or a textblock. So what is happening is it is seeing chunks of text as textblocks (not double-checking to see if they are data paragraphs) and wraps them.

We discovered this because we had sample pieces of JSON in our own =begin and =end data blocks which were then reformatted during a tidy run.

http://perldoc.perl.org/perlpodspec.html#About-Data-Paragraphs-and-%22%3dbegin%2f%3dend%22-Regions

"Data paragraphs are typically used for inlining non-Pod data that is to be used (typically passed through) when rendering the document to a specific format"

mlawren commented 12 years ago

Can you provide the link to the bug report you made against Pod::Tidy? That is where I think most of the discussion on this topic should happen.

With regards to "since there is no way to change or alter this behavior from githook-perltidy" can you be more specific on the type of behaviour that is missing or can be improved in githook-perltidy? The ".podtidy-opts" file is not sufficient?

finn commented 12 years ago

Bug report on Pod::Tidy:

https://rt.cpan.org/Ticket/Display.html?id=78043

finn commented 12 years ago

My first impulse was to see if it was possible to only have Perl::Tidy turned on while turning off Pod::Tidy during the execution of the githook-perltidy script.

While the .podtidy-opts file is good for adjusting the behavior of Pod::Tidy, it has no effect on this bug. (Due to the fact that Pod::Tidy doesn't differentiate data paragraphs from verbatim or textblock paragraphs.)

As far as I can tell githook-perltidy always runs both the perl and POD tidying. Perhaps an option to turn off one or the other would be useful.

mlawren commented 12 years ago

I'm not sure that adding a feature as a work around for such an external bug makes good sense. The whole point of githook-perltidy after all is to tidy things up. Are there any other (other than this bug) reasons for turning off one or the other? On the other hand it is also kind of hard to ignore these kinds of problems as completely unrelated.

Running perltidy is a very integrated part of githook-perltidy. If that needs to be turned off I think temporarily removing the githook-perltidy calls from the Git hooks is best. Podtidy can be more easily made optional which I have just done in the "podtidybug" branch, but I'm not yet convinced that this should go into devel and eventually master. Perhaps it is enough for you to install off this branch until the Pod::Tidy bug is fixed?

mlawren commented 12 years ago

Forgot to mention that Pod::Tidy will not be used in the "podtidybug" branch if the ".podtidy-opts" file does not exist, but I didn't really think about this very hard. Would you prefer an environment variable?

finn commented 12 years ago

I hadn't realized that PODs were also being tidied until we ran into this bug. I guess my expectation was that the script would only run Perl::Tidy since it's name is similar to the perltidy script that ships with the Perl::Tidy module. From my point of view, the Pod::Tidy aspect is a nice add-on for those who want/need it, but not strictly necessary. But it's your code/module obviously!

Having the Pod::Tidy run skipped if the .podtidy-opts file is not present seems like a nice workaround for now.

mlawren commented 12 years ago

The podtidy part was documented, but your point about names makes sense. I'd like to follow the principle of least surprise so the podtidybug branch has now been merged into master, and a new version 0.10.2 released to CPAN.

I think I will eventually anyway replace "githook-perltidy" with a "githook" command that takes names of plugins to run before and after commits, because the "make" component has a similar "hidden feature" problem.