pickhardt / betty

Friendly English-like interface for your command line. Don't remember a command? Ask Betty.

Wrangle in regular expressions #74

Open brysgo opened 10 years ago

brysgo commented 10 years ago

To approach #57 in bite-sized chunks, a good first step would be to pull all the regular expression usage into the beginnings of a grammar.

This gem looks like it would be perfect for the task. Check it out: http://semr.rubyforge.org/

If you don't feel like clicking a link, here is one cool example...

require 'rubygems'
require 'semr'

language = Semr::Language.create do #also accepts a path to a file instead of a block
  concept :number,    any_number, :normalize => as_fixnum
  concept :greeting,  words('hi', 'goodbye', 'hello')
  phrase 'say :greeting :number times' do |greeting, number|
    number.to_i.times { puts greeting }
  end
end

language.parse('say hello 6 times')
# hello
# hello
# hello
# hello
# hello
# hello

language.parse('say goodbye 2 times')
# goodbye
# goodbye

pickhardt commented 10 years ago

Semr looks interesting. I am concerned that it hasn't been updated in 6 years - https://github.com/mdeiters/semr

From what I've observed, projects usually end up building their own tokenizer, but there's no reason for us to do that yet if a library does 80% or more of what we'd want.

pickhardt commented 10 years ago

I'd consider semr but do you know of any other good grammar projects?

Another approach is building a simple tokenizer ourselves.

One idea is to pass around an array<string|classes> as the list of commands to interpret. Starting from "copy all files ending in rb to my projects directory", successive passes would rewrite it something like this:

  1. ["copy all files ending in rb to my projects directory"]
  2. ["copy all files ending in rb to ", Directory instance]
  3. ["copy", Files instance, "to", Directory instance]
  4. [Copy command, Files instance, "to", Directory instance]
  5. [Copy command, Files instance, Preposition instance, Directory instance]
  6. execute!

Where we create classes for Directory, Files, Preposition, and Command.
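
The steps above can be sketched as a series of regex passes over an array that starts as one string. This is only an illustration: the Struct-based token classes, the PASSES table, and the mapping from "my projects directory" to ~/projects are all assumptions, not anything that exists in betty.

```ruby
# Hypothetical token classes; none of these exist in betty today.
Directory   = Struct.new(:path)
Files       = Struct.new(:pattern)
Preposition = Struct.new(:word)
CopyCommand = Struct.new(:name)

# Each pass is [regex, builder]. The builder turns a MatchData into a token.
# The order mirrors steps 2 to 5 in the list above.
PASSES = [
  [/\bmy (\w+) directory\b/,        ->(m) { Directory.new("~/#{m[1]}") }],
  [/\ball files ending in (\w+)\b/, ->(m) { Files.new("*.#{m[1]}") }],
  [/\bcopy\b/,                      ->(_) { CopyCommand.new('cp') }],
  [/\bto\b/,                        ->(m) { Preposition.new(m[0]) }]
].freeze

# Replace the first match of `regex` inside each remaining String element,
# splitting that string into [text before, token, text after].
def apply_pass(tokens, regex, builder)
  tokens.flat_map do |token|
    next [token] unless token.is_a?(String)
    match = regex.match(token)
    next [token] unless match
    [match.pre_match.strip, builder.call(match), match.post_match.strip]
      .reject { |part| part.is_a?(String) && part.empty? }
  end
end

def tokenize(sentence)
  PASSES.reduce([sentence]) do |tokens, (regex, builder)|
    apply_pass(tokens, regex, builder)
  end
end

tokens = tokenize('copy all files ending in rb to my projects directory')
p tokens.map(&:class)
```

Once no strings are left in the array, the execute! step would only have to deal with typed tokens.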

brysgo commented 10 years ago

That sounds good. I would definitely be up for building a tokenizer. As long as it scales well, I think it could be great.

I think being able to have broad tokens that could be made progressively more specific by drawing from context would be super cool.
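
One way to read "broad tokens made more specific from context" is a refinement pass that looks at neighbouring tokens. A rough sketch, where BroadPath, DirectoryToken, and the "follows the word to" rule are all made up for illustration:

```ruby
# Hypothetical tokens: BroadPath is the broad form, DirectoryToken the refinement.
BroadPath      = Struct.new(:text)
DirectoryToken = Struct.new(:text)

# Refine a BroadPath into a DirectoryToken when the previous token is "to".
# Everything else is passed through unchanged.
def refine(tokens)
  tokens.each_with_index.map do |token, index|
    previous = index.zero? ? nil : tokens[index - 1]
    if token.is_a?(BroadPath) && previous == 'to'
      DirectoryToken.new(token.text)
    else
      token
    end
  end
end

refined = refine(['copy', BroadPath.new('notes'), 'to', BroadPath.new('projects')])
p refined.map(&:class)
```

Here only the path after "to" is narrowed. The first one stays broad until some other rule has enough context to decide what it is.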

brysgo commented 10 years ago

class Copy < Token
  argument source: FileInstance
  argument destination: DirectoryInstance,
    question: 'Where would you like to copy it to?'
  statement 'copy', :source, 'to', :destination
  statement 'copy', :source,
    ask: :destination

  def call
    BashCommand.call('cp', source.call, destination.call)
  end
end

Does this seem like a reasonable sketch of what copy might look like?
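
For a class like this to load, Token would need class-level argument and statement macros. Below is a minimal sketch of what that base class might look like. The storage format is an assumption, FileInstance and DirectoryInstance are empty stand-ins, and the call method is left out because BashCommand is not defined anywhere yet.

```ruby
# Hypothetical base class: it only records what a subclass declares.
class Token
  class << self
    def arguments
      @arguments ||= {}
    end

    def statements
      @statements ||= []
    end

    # argument source: FileInstance, question: '...'
    # Stores the type and optional question, and defines an accessor.
    def argument(spec)
      question = spec.delete(:question)
      spec.each do |name, type|
        arguments[name] = { type: type, question: question }
        attr_accessor name
      end
    end

    # statement 'copy', :source, 'to', :destination
    # A trailing hash (e.g. ask: :destination) is kept as options.
    def statement(*pattern)
      options = pattern.last.is_a?(Hash) ? pattern.pop : {}
      statements << { pattern: pattern, options: options }
    end
  end
end

# Empty stand-ins so the example is self-contained.
FileInstance = Class.new
DirectoryInstance = Class.new

class Copy < Token
  argument source: FileInstance
  argument destination: DirectoryInstance,
    question: 'Where would you like to copy it to?'
  statement 'copy', :source, 'to', :destination
  statement 'copy', :source,
    ask: :destination
end

p Copy.arguments.keys
p Copy.statements.map { |s| s[:pattern] }
```

Keeping the declarations as plain data means the matching logic can live in one place and be swapped out later without touching each command class.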

pickhardt commented 10 years ago

Yes, that makes a lot of sense.

One thing is that 'copy' and 'to' should actually be more general, maybe a regex instead of a string, because I could imagine wanting to say "duplicate source to my home directory".
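
A small sketch of that idea: let a statement literal be either a String or a Regexp and compare accordingly. The helper name literal_match? and the synonym list are assumptions for illustration only.

```ruby
# Hypothetical helper: does `word` satisfy a literal from a statement?
# A literal may be a String (exact match) or a Regexp (any listed synonym).
def literal_match?(literal, word)
  case literal
  when Regexp then literal.match?(word)
  when String then literal == word
  else false
  end
end

# The synonym list here is an assumption.
COPY_VERB = /\A(copy|duplicate|clone)\z/i

p literal_match?(COPY_VERB, 'duplicate')  # true
p literal_match?('copy', 'duplicate')     # false
```

With this, a statement could be declared with COPY_VERB in place of 'copy', and "duplicate source to my home directory" would reach the same command.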

brysgo commented 10 years ago

:+1: Okay, I will start implementing this refactor at some point this week and throw it on a branch.

brysgo commented 10 years ago

How would you feel about doing something like this?

  argument copy: WordExpansion
  statement :copy, :source, 'to', :destination

Meanwhile, in WordExpansion land...

class WordExpansion < Token
  statement do |options|
    search_text = options[:search_text]
    argument_name = options[:argument_name]
    expansions = Thesaurus.expand(argument_name)
    # Return the remaining text for the first synonym that prefixes search_text.
    expansions.each do |expansion|
      if result = pop_text(search_text, expansion)
        return result
      end
    end
    nil
  end

  private

  # Strip a leading keyword from search_text and return the rest, or nil.
  def pop_text(search_text, keyword)
    split_text = search_text.split(keyword)
    if split_text.length > 1 && split_text[0] == ''
      split_text[1..-1].join(keyword)
    end
  end
end

brysgo commented 10 years ago

Never mind on the above for now.

I started implementing it and realized that we probably want to start by just making all the arguments a SimpleMatcher type to get the API fleshed out. After that we can find common argument types and factor them out.

pickhardt commented 10 years ago

OK. I'm interested in what approach you are taking, so feel free to start a pull request early while you're in the process of developing it.

brysgo commented 10 years ago

Right now I am moving the find command over to the style we talked about above. I will submit a pull request when I have it in some working order that doesn't look too horrendous.

I don't usually do much Ruby metaprogramming, so I've been playing with the Token class, trying to get it to work with the same API as above.

brysgo commented 10 years ago

I didn't forget about this...

After my experiment with the pull request above I decided it might be easier to go with semr after all.

Check out mdeiters/semr#1