louismullie / treat

Natural language processing framework for Ruby.

Can you add new worker modules on (or off) the fly? #81

Open dalton opened 10 years ago

dalton commented 10 years ago

I read the manual on how to dynamically extend Treat. Am I correct in identifying this as the Treat::Workers::Groupable#add method? That method appears to define a new class based on a symbol and block and add it to the @@list variable of the Groupable module.

I'm working on a Summarization project and would like to add a summarizers module under workers, but I'm having some trouble figuring out if I need to do anything with the Autoload and Categorizable modules.

Is there a way to extend Treat non-dynamically like this? Similar to a Rails engine perhaps?
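For readers unfamiliar with the pattern described in this comment, here is a minimal sketch of an `add` method that defines a new class from a symbol and a block and records it in a list. It is a simplified stand-in, not Treat's actual implementation: `MiniGroupable`, `Summarizers`, and the `call` class method are hypothetical names, and an instance variable is used for the list where the comment mentions `@@list`.

```ruby
# Simplified stand-in for the pattern described above; NOT Treat's code.
module MiniGroupable
  # Names of the workers registered on the extending module.
  def list
    @list ||= []
  end

  # Define a worker class named after `name` whose behaviour is `block`.
  def add(name, &block)
    class_name = name.to_s.split('_').map(&:capitalize).join
    worker = Class.new
    worker.define_singleton_method(:call) { |*args| block.call(*args) }
    const_set(class_name, worker) # e.g. Summarizers::FirstWords
    list << name
    worker
  end
end

# Hypothetical worker category; it gains `add` only because it extends
# MiniGroupable.
module Summarizers
  extend MiniGroupable
end

Summarizers.add(:first_words) do |text, options = {}|
  text.split.first(options.fetch(:count, 3)).join(' ')
end

puts Summarizers.list.inspect
puts Summarizers::FirstWords.call('sometimes programming is hard', { count: 2 })
```

A module that does not extend `MiniGroupable` has no `add` method in this sketch, so calling it there would raise a `NoMethodError`.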

louismullie commented 10 years ago

Yes, this is the correct method. There is no need to add a new module under workers; I would just create the new module under Treat::Workers::Extractors.

louismullie commented 10 years ago

To be clear, you'll need to manually create the module Treat::Workers::Extractors::Summary, and then call the .add method to add a new summarizer.

dalton commented 10 years ago

Are you saying to create the new module inside the gem directory, or can I do it within my application? If the latter, do I need to require the module before I require Treat so that the Treat initialization picks it up? It seems to depend on the directory structure, so I don't know if that will work. If I require the module after Treat, do I need to call anything to initialize the new Summary module and make it Groupable and Categorizable?

Here is basically what I have so far: https://github.com/dalton/treat_mod

If I do it this way I get

/gems/treat-2.0.7/lib/treat/core/dsl.rb:17:in `method_missing': undefined method `add' for Treat::Workers::Extractors::Summary:Module (NoMethodError)
    from lib/trimmer.rb:7:in `<main>'

If I reverse the order of requires to

require_relative 'treat/workers/extractors/summary'

require 'treat'
include Treat::Core::DSL

I get

gems/treat-2.0.7/lib/treat/workers/categorizable.rb:34: warning: already initialized constant Treat::Workers::Extractors
treat_mod/lib/treat/workers/extractors/summary.rb:3: warning: previous definition of Extractors was here
lib/trimmer.rb:8:in `<main>': uninitialized constant Treat::Workers::Extractors::Summary (NameError)

Thanks for the help, I know I'm just missing something basic here.
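One plausible reading of the warnings in the second attempt is that Treat assigns the `Extractors` constant itself during initialization, which replaces a module of the same name defined earlier along with anything nested in it. The sketch below reproduces that effect in plain Ruby. `Host` is a hypothetical namespace, and the use of `const_set` is an assumption inferred from the warning text, not something confirmed from Treat's source.

```ruby
# Hypothetical host namespace standing in for a framework.
module Host
  module Workers; end
end

# Step 1: application code defines its module first, as when the custom
# file is required before the framework.
module Host::Workers::Extractors
  module Summary; end
end

before = Host::Workers::Extractors.const_defined?(:Summary, false)

# Step 2: the framework creates the category module itself. Ruby warns
# "already initialized constant" and the earlier module is replaced.
Host::Workers.const_set(:Extractors, Module.new)

after = Host::Workers::Extractors.const_defined?(:Summary, false)

puts "Summary defined before re-assignment: #{before}"
puts "Summary defined after re-assignment:  #{after}"
```

Under this assumption, the nested `Summary` module is no longer reachable through the new `Extractors` constant, which would match the `NameError` shown above.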

dalton commented 10 years ago

For now I edited the gem source directly.

# mkdir lib/treat/workers/extractors/summary
# lib/treat/config/data/workers/extractors.rb
  ...
  summary: {
    type: :transformer,
    targets: [:group]
  }

I'll keep looking to see if I can do this engine-style, but I was able to add my worker this way:

# lib/summarize.rb
require 'treat'
include Treat::Core::DSL

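# Register a :hodor worker in the Summary category (relies on the config edit above).
Treat::Workers::Extractors::Summary.add(:hodor) do |sentence, options={}|
  # Append one 'Hodor' token for every whitespace-separated word.
  sentence.to_s.split(' ').each do |_token|
    sentence << Treat::Entities::Token.from_string('Hodor')
  end
end

x = sentence('sometimes programming is hard').summary(:hodor)
puts x.to_s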

# ➜   ruby lib/summarize.rb
# Hodor Hodor Hodor Hodor
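
Editing the config file presumably works because the worker category modules are built from that configuration when the library loads. The following plain-Ruby sketch shows how a config entry of that shape could drive module creation. The loader loop, `MiniGroupable`, `CATEGORY_CONFIG`, and the top-level `Extractors` module are all hypothetical and are not taken from Treat's source.

```ruby
# Hypothetical stand-in for a Groupable-like module; not Treat's code.
module MiniGroupable
  def add(name, &block)
    worker = Class.new
    worker.define_singleton_method(:call) { |*args| block.call(*args) }
    const_set(name.to_s.capitalize, worker)
  end
end

# Mirrors the shape of the config entry above; the loader below is assumed.
CATEGORY_CONFIG = {
  summary: { type: :transformer, targets: [:group] }
}

module Extractors; end

# Build one category module per config key and make it accept `add`.
CATEGORY_CONFIG.each_key do |category|
  mod = Module.new
  mod.extend(MiniGroupable)
  Extractors.const_set(category.to_s.capitalize, mod)
end

# With the category in place, registering a worker succeeds.
Extractors::Summary.add(:hodor) do |text|
  text.split.map { 'Hodor' }.join(' ')
end

puts Extractors::Summary::Hodor.call('sometimes programming is hard')
```

In this sketch a category that is missing from the config never gets a module, so there is nothing to call `add` on.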