vralfy / phpcsmd

Netbeans plugin to report code measurements generated by phpcs, phpmd, phpcpd and pdepend as Tasks and Annotations
27 stars 6 forks source link

Settings > PHP > PHPCSMD > General > ignore regex ... isn't regular expression'ing. #39

Closed jedgell closed 10 years ago

jedgell commented 10 years ago

Under the general settings, there's a field for an ignore regex. If you put a true regular expression in there, it fails. The method in use in code is [java.util.regex.Pattern.matches](http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html#matches%28java.lang.String, java.lang.CharSequence%29) which is a convenience method that maps to java.util.regex.Matcher.matches after compiling the regular expression. The thing is...Matcher.matches() does not do true regular expression matching - it kinda does work, sorta. The string passed to Matcher.matches() as a regex might as well have ^ and $ wrapped around it, because it evaluates whether the passed in regex is the same as the string passed for comparison.

In phpcsmd/src/de/foopara/phpcsmd/generics/GenericHelper.java, you should be using Matcher.find(), in stead, as it actually runs a true comparison.

How I found this out: I elaborated on the pre-populated regex in the field to come up with the following:

\.(svn|git|DS_Store)|(nbproject|vendor|(doc|test)(s)?)|(README|LICENSE|CHANGELOG)(\.txt|\.md)?$|\.(js|(c|le|sc)ss|(sv|jp|pn)g|(tx|od)t|(y|pht|xht|ht|x)ml|(pd|gi)f|(b|g)?z(ip)?(2)?|java|sql|doc|tar|xls|csv)$

The failures started happening immediately on .Ds_Store and README files in my project directory. Here are a couple example paths:

I tested the regex in PHP and JS interpreters. It works. I found on online Java Regular Expression Test Page (because I don't write Java) and it was able to test the regular expression against Matcher's many methods and link me to the JavaDoc, where I read this:

Once created, a matcher can be used to perform three different kinds of match operations:

  • The matches method attempts to match the entire input sequence against the pattern.
  • The lookingAt method attempts to match the input sequence, starting at the beginning, against the pattern.
  • The find method scans the input sequence looking for the next subsequence that matches the pattern.

Each of these methods returns a boolean indicating success or failure.

Please, if you're going to offer me the opportunity to write my own regular expression, do not crimp my style. Give me the find method and lemme go hog wild.

I have no mechanism for testing any code I might supply (because I don't write Java), so here's my suggested resolution:

Starting at line 157, something like:

Pattern p = Pattern.compile(pattern);
Matcher m = p.matcher(file.getAbsolutePath().toLowerCase());

if (m.find()) {
    Logger.getInstance().logPre(file.getAbsolutePath() + " matches " + pattern + " " + retMatched, "ignore", Logger.Severity.EXCEPTION);
    return retMatched;
}

I'm sorry for the very long winded message. This is the third project I've run into this week where an author has said "Put regex here:", and when I did, something funny was happening behind the scenes, and the lovely regex I'd crafted didn't work.

jedgell commented 10 years ago

Are you kidding me? So, I went to the trouble of installing the extra plugin and debugging down through that bit and I missed that you're calling toLowerCase() on the file name as well? How would a user know this? I'll get you a pull request back to you to fix this, but don't tell people they're entering a regular expression, if all you want is a pipe delimited list.

vralfy commented 10 years ago

Thanks for your contribution ... This match-find issue is new to me.