kschiess / parslet

A small PEG based parser library. See the Hacking page in the Wiki as well.
kschiess.github.com/parslet
MIT License

Best way to match case insensitive strings? #5

Closed PlasticLizard closed 13 years ago

PlasticLizard commented 13 years ago

I am trying to match a set of keywords which are case insensitive, but am having trouble seeing from the various atom implementations how this might be done, as it doesn't appear that flags can be passed into match or string. In past grammars I have had to resort to things like

[Mm][Yy] [Kk][Ee][Yy][Ww][Oo][Rr][Dd]

But I was very much hoping there was something I'm missing that would make that unnecessary? :)

floere commented 13 years ago

Hi Nathan

I'm not the original author, but I wanted to remind you that Parslet uses old school Ruby to do stuff.

So you could do it like this:

require 'parslet'
include Parslet

def ignorecase str
  key_chars = str.split //
  key_chars.collect! do |char|
    other_char = char.match(/[a-z]/) ? char.upcase : char.downcase
    match("[#{char}#{other_char}]")
  end
  key_chars.inject do |result, some_match|
    result ? result >> some_match : some_match
  end
end

# Constructs a parser using a Parsing Expression Grammar
parser = ignorecase('keyword')

result = parser.parse "kEyWoRd"
p result

This is a complete example that you can run. (Not beautiful, but I hope understandable :) )

I'm sure Kaspar has some additional help, but in the meantime…

Cheers Florian

floere commented 13 years ago

P.S.: Actually, this does not make what you describe unnecessary; it just packages it inside a Ruby method, btw.
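To make visible what the method packages up, here is a plain-Ruby sketch that only builds the character-class strings, one per character, of the kind that get passed to match above. The helper name case_insensitive_classes is hypothetical and not part of Parslet. Unlike the version above, it escapes characters such as "]" or "-" and leaves non-letters as single-character classes. The Regexp at the end is only there to check the expansion without needing the gem.

```ruby
# Hypothetical helper (not part of Parslet): returns one character-class
# string per character of the keyword, e.g. "k" becomes "[kK]".
def case_insensitive_classes(str)
  str.each_char.map do |char|
    # Non-letters have identical down/upcase forms, so uniq keeps just one.
    variants = [char.downcase, char.upcase].uniq
    # Escape so that characters like "]" or "-" stay literal inside the class.
    "[" + variants.map { |c| Regexp.escape(c) }.join + "]"
  end
end

classes = case_insensitive_classes("my keyword")
p classes.first(2)                 # → ["[mM]", "[yY]"]

# Sanity check with a plain Regexp built from the same classes.
pattern = Regexp.new("\\A" + classes.join + "\\z")
p pattern.match?("My KeyWord")     # → true
p pattern.match?("my keywords")    # → false
```

With Parslet loaded, each of these strings could be handed to match and the results chained with >>, as in the method above.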

PlasticLizard commented 13 years ago

That's a pretty slick trick - thanks

floere commented 13 years ago

Glad it helped :)

kschiess commented 13 years ago

This would also work, and demonstrates that parslet is really extensible:

require 'parslet'
include Parslet

class CaseInsensitiveStr < Parslet::Atoms::Str
  def try(io) # :nodoc:
    old_pos = io.pos
    s = io.read(str.size)
    # Check for short input before downcasing, since read may return nil at end of input.
    error(io, "Premature end of input") unless s && s.size==str.size
    s = s.downcase
    error(io, "Expected #{str.inspect}, but got #{s.inspect}", old_pos) \
      unless s==str
    return s
  end
end

def ignorecase str
  CaseInsensitiveStr.new(str.downcase)
end

# Constructs a parser using a Parsing Expression Grammar
parser = ignorecase('keyword')

result = parser.parse "kEyWoRd"
p result

And I think that try method should read cleaner in future versions; I'm taking that as a suggestion.
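For anyone who wants to try the read-and-compare idea without touching Parslet internals, here is a stand-alone sketch. It assumes the input behaves like Ruby's StringIO and that keywords are ASCII; read_keyword is a hypothetical name, not a Parslet method. It returns the text as it appeared in the input (original case) and rewinds on failure.

```ruby
require 'stringio'

# Hypothetical stand-alone helper, not Parslet API. Reads as many bytes as
# the keyword has and compares them case-insensitively (ASCII assumed).
def read_keyword(io, keyword)
  start = io.pos
  chunk = io.read(keyword.bytesize)   # nil at end of input
  return chunk if chunk && chunk.downcase == keyword.downcase
  io.pos = start                      # rewind so something else can be tried
  nil
end

io = StringIO.new("kEyWoRd rest")
p read_keyword(io, "keyword")                 # → "kEyWoRd"
p io.pos                                      # → 7
p read_keyword(StringIO.new(""), "keyword")   # → nil
```

Checking for nil before calling downcase is what keeps the empty-input case from raising NoMethodError.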