atom / language-ruby

Ruby package for Atom
Other
101 stars 144 forks source link

Regexp with some kinds of interpolation get transformed into comments. #82

Open envygeeks opened 9 years ago

envygeeks commented 9 years ago

Full example:

module EnvyGeeks
  module Get extend self
    def ips(cidr = "192.168.1.0/24")
      # Technically we don't need the select but we should defensively double chck.
      `ip route show #{cidr}`.each_line.select { |v| v =~ /scope link/ }.map do |v|
        IPAddr.new(v.match(/(?:src )(#{Regexp.escape(cidr.gsub(/\/\d+\Z/, "").gsub(/\.\d+\Z/, ""))}\.\d+)/)[1])
      end
    end
  end
end
nokutu commented 9 years ago

For me this is highlighted as a regular expression:

  /\/\d+\Z/

Sorry, but I don't know how to copy with colors, but it's the same as any other regular expression.

lee-dohm commented 9 years ago

Reproduced using Atom v0.193.0-b1d005f on Mac OS X 10.10.3

screen shot 2015-04-20 at 8 04 28 am

nokutu commented 9 years ago

I see, my fault.

mamenama commented 8 years ago

Ugh. I was playing with this for a while to no avail. I added commit 6a5a77c909fde7bd826163f711126c83d2046b3d to my local fork that adds a test for this behavior (which obviously fails). Since it fails, I won't do a PR but someone with a better feel for what's wrong can feel free to grab them and use them to fix. I am at a loss. :disappointed:

For posterity, here is a reduced version:

image

In TextMate it seems like the whole thing is mostly treated like a regex, AFAICT: image

envygeeks commented 8 years ago

Yeah it's one crazy tricky problem that's why I never tried to fix it, I attempted as well and was just like "I can't solve this easily right now I'll come back when I have time since it's a rare problem."

mamenama commented 8 years ago

Yeah, part of the problem I ran into is the layout and regexes in the ruby.cson file. It looks like it was a CSON conversion of a JSON conversion of a TextMate file (obviously a heroic feat!) but my eyes glaze over regexes like '(?<![\\w)])((/))(?![*+?])(?=(?:\\\\/|[^/])*/[eimnosux]*\\s*([\\]#).,?:}]|$|\\|\\||&&|<=>|=>|==|=~|!~|!=|;|if|else|elsif|then|do|end|unless|while|until|or|and))'

I think the original TextMate file is better commented (I can see it in TM but can't find it online, but here is the Sublime Text version with the "cleaner" regex style. At a glance, however, it seems that the Atom one has diverged from these other two so it's not as simple as just porting it over. Again, someone with more knowledge might be able to make heads or tails of it.

infininight commented 8 years ago

So the issue with recognizing the start of a regular expression is distinguishing between it and the / operator. The rule for TextMate and Atom take a similar approach by looking and the surrounding code to try to ascertain the context.

The rule in TextMate looks at the code preceding the opening slash to try to rule out false positives. Besides looking directly after the slash a single character it does not look forward at all.

The rule in Atom however starts with the slash and then tries to look at what follows the expression string. This is where the problem is, the expression to match the content of the regular expression string is very naive simply looking for escaped slashes or any character except slashes. The problem is that a slash can appear unescaped inside embedded ruby code or even a simple character class as in this also breaking example:

/test[/f]/

It's a complicated issue we've been trying to perfect for many years now, not sure which is the better approach. For the Atom approach to work though it needs to flawlessly match the contents of any regular expression to know when it is at the end of one.

hediyi commented 8 years ago

I think both approaches have certain limits (or flaws). From what I know, language-ruby's approach

While ruby.tmbundle's approach

@50Wliu Rather than to say which is better, I say language-ruby's problem is slightly bigger.

winstliu commented 8 years ago

When are unescaped slashes allowed? If they're only allowed in character classes I might have an idea.

hediyi commented 8 years ago

Regexps allow Interpolations, in which you can use / unescaped.

But people rarely create regexps like that, so we're ready to hear your great idea. 😄

winstliu commented 8 years ago

I'm assuming if we move the division operator above the regexp block that'll break regexes entirely, correct?

hediyi commented 8 years ago

Indeed.

AlexWayfer commented 7 years ago

My case:

DB_DUMP_REGEXP = /^
  #{db_config[:database]}_#{DB_DUMP_TIMESTAMP_REGEXP}
  #{Regexp.escape(DB_DUMP_EXTENSION)}
$/xo

image