dryabov / phpregexp

Support of regular expressions parsing in PhpStorm
MIT License

Empty group/unclosed group with non-capturing group #4

Open lol768 opened 9 years ago

lol768 commented 9 years ago

Code sample:

if (preg_match('/^.*(?:\.htm)$/i', $fn)) {
    $isHtml = true;
}

IDE:

Regex101 output:

As far as I'm aware, the regex is valid. It works in production & when testing locally. Am I doing something wrong or is this a bug?
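One way to double-check the claim outside the IDE is to run the pattern through PCRE directly. This is a minimal sketch, not part of the original report: the file names in `$samples` are hypothetical and chosen only for illustration, while the pattern is copied from the code sample above.

```php
<?php
// Pattern and flag copied from the code sample in this issue.
$pattern = '/^.*(?:\.htm)$/i';

// Hypothetical file names, used only for illustration.
$samples = ['index.htm', 'INDEX.HTM', 'index.html', 'notes.txt'];

$results = [];
foreach ($samples as $fn) {
    // preg_match() returns 1 on a match, 0 on no match, and false on failure
    // (for example, when the pattern does not compile).
    $results[$fn] = preg_match($pattern, $fn);
}

// If no call returned false, PCRE accepted the pattern.
$compiled = !in_array(false, $results, true);

var_dump($compiled);
print_r($results);
```

If PCRE accepts the pattern here, the "Empty group"/"Unclosed group" messages come from the IDE-side parser rather than from the regex itself.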

dryabov commented 9 years ago

Regex is valid.

What version of PhpStorm do you use? (I have 8.0.3, and there are no "Empty group" or "Unclosed group" inspection errors for your code snippet.) Do you use a third-party regex inspection plugin?

lol768 commented 9 years ago

Using 8.0.3. As far as I'm aware, this is the only inspection plugin I'm using.

Full plugin list: Apache config (.htaccess) support, ASP, BashSupport, Behat Support, Blade Support, CamelCase, CoffeeScript, Command Line Tool Support, CSS Support, CVS Integration, Database Tools and SQL, Drupal Support, Dummy Text Generator, DynamicReturnTypePlugin, File Watchers, Framework MVC Structure Support, Gherkin, Git Integration, GitHub, GNU GetText files support (*.po), Google App Engine Support for PHP, HAML, Handlebars/Mustache, hg4idea, HTML Tools, IdeaVim, Ini4Idea, IntelliLang, Java Server Pages Integration, JavaScript Debugger, JavaScript Intention Power Pack, JavaScript Support, LESS support, Perforce Integration, Phing Support, PHP, PHP 1Up!, PHP RegExp Support, PHP Remote Interpreter, QuirksMode, Refactor-X, Remote Hosts Access, REST Client, ReStructuredText Support, SASS support, SSH Remote Run, Subversion Integration, Task Management, Terminal, TextMate bundles support, Twig Support, UML Support, Vagrant, W3C Validators, WordPress Support, XenForo Integration, XPathView + XSLT Support, XSLT-Debugger, YAML

If I disable PHP RegExp Support and restart PhpStorm, the errors go away.

lol768 commented 9 years ago

Just disabled a bunch of the plugins to see if they were causing it. New plugin list:

CSS Support, Database Tools and SQL, File Watchers, Framework MVC Structure Support, HTML Tools, PHP, PHP RegExp Support

Most of those are JetBrains plugins. I can still reproduce the issue with just the plugins above.

dryabov commented 9 years ago

Could you run "Inspect code" and check which inspection generates this error (e.g. General/Annotator)?

lol768 commented 9 years ago

dryabov commented 9 years ago

I've submitted version 0.9.2 to the JetBrains plugin repository, and it will be available after moderation. Could you check this issue with the new version?

PS. I'm still unable to reproduce this issue locally.

lol768 commented 9 years ago

I can still reproduce the issue with 0.9.2, downloaded from the JetBrains site.

Is there anything I might've configured that could have changed the behaviour of the plugin?

dryabov commented 9 years ago

What is the encoding of the file? From the screenshot, it looks like the regex is 6 characters long instead of 13, so maybe the file is UTF-16 and the plugin doesn't process it correctly (though that's unlikely, since UTF-16 is Java's internal string encoding and should be handled the same way as ASCII).

lol768 commented 9 years ago

The file does not include a byte order mark.

The relevant part of the file is encoded as:

0000330: 0a0a 0a0a 0a69 6620 2870 7265 675f 6d61  .....if (preg_ma
0000340: 7463 6828 272f 5e2e 2a28 3f3a 5c2e 6874  tch('/^.*(?:\.ht
0000350: 6d29 242f 6927 2c20 2466 6e29 2920 7b0a  m)$/i', $fn)) {.
0000360: 2020 2020 2469 7348 746d 6c20 3d20 7472      $isHtml = tr
0000370: 7565 3b0a 7d0a                           ue;.}.
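The dump can also be checked programmatically. The following is a rough sketch, assuming the hex bytes below were transcribed correctly from the dump above: it decodes them, checks that they are plain ASCII (and therefore valid UTF-8) with no NUL bytes, which UTF-16-encoded ASCII text would contain, and measures the regex literal. The variable names are illustrative only.

```php
<?php
// Bytes transcribed from the hex dump above (offsets 0x330 to 0x375).
$hex = '0a0a0a0a0a69662028707265675f6d61'
     . '74636828272f5e2e2a283f3a5c2e6874'
     . '6d29242f69272c2024666e2929207b0a'
     . '2020202024697348746d6c203d207472'
     . '75653b0a7d0a';

$bytes = hex2bin($hex);

// UTF-16-encoded ASCII text would contain a NUL byte next to every character.
$hasNul = strpos($bytes, "\0") !== false;

// The /u modifier makes PCRE reject subjects that are not valid UTF-8.
$isUtf8 = preg_match('//u', $bytes) === 1;

// True when every byte is in the 7-bit ASCII range.
$isAscii = preg_match('/[^\x00-\x7F]/', $bytes) === 0;

// The snippet contains exactly two single quotes, around the regex literal.
$start   = strpos($bytes, "'");
$end     = strrpos($bytes, "'");
$literal = substr($bytes, $start + 1, $end - $start - 1);

// Strip the leading delimiter and everything from the closing delimiter on.
$body = substr($literal, 1, strrpos($literal, '/') - 1);

var_dump($hasNul, $isUtf8, $isAscii);
echo $literal, ' => body length ', strlen($body), "\n";
```

Under these assumptions the regex body on disk is the full 13 characters, so any truncation to 6 would have to happen after the file is read.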

A minimum reproducible example is available at: https://github.com/lol768/lol768/releases/tag/v1

The bottom right status bar tells me the file is being read using UTF-8:

Platform info:

➜  ~  lsb_release -a; uname -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 14.04.1 LTS
Release:    14.04
Codename:   trusty
Linux gntpc 3.13.0-45-generic #74-Ubuntu SMP Tue Jan 13 19:36:28 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
➜  ~  locale 
LANG=en_GB.UTF-8
LANGUAGE=en_GB:en
LC_CTYPE=en_GB.UTF-8
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_PAPER="en_GB.UTF-8"
LC_NAME="en_GB.UTF-8"
LC_ADDRESS="en_GB.UTF-8"
LC_TELEPHONE="en_GB.UTF-8"
LC_MEASUREMENT="en_GB.UTF-8"
LC_IDENTIFICATION="en_GB.UTF-8"
LC_ALL=
➜  ~