ondra-m / ruby-spark

Ruby wrapper for Apache Spark
MIT License

Passing function #32

Open librarywebchic opened 8 years ago

librarywebchic commented 8 years ago

I'm trying to understand what is described here in the wiki https://github.com/ondra-m/ruby-spark/wiki/Passing-function

This seems to describe the issue I'm seeing: I have a method I want to use, but it isn't recognized when I call it within .map. I can't work out a solution from the documentation.

Something like this doesn't work:

def manipulate_data(line)
  line + " more stuff"
end

api_keys = requests_with_key.map(lambda {|line| [manipulate_data(line), 1]})

But I am unsure how I would make it work. Is it as simple as adding requests_with_key.bind(method(:manipulate_data))?

ondra-m commented 8 years ago

You cannot pass extra methods into map. Inline the logic in the lambda instead:

requests_with_key.map(lambda {|line| [line + " more stuff", 1]})

Another option is to use a different mapping, map_partitions, and define the method inside the function:

func = lambda {|part| 
  def manipulate_data(line)
    line + " more stuff"
  end

  part.map do |line|
    [manipulate_data(line), 1]
  end
}
api_keys = requests_with_key.map_partitions(func)
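
For what it's worth, here is a rough plain-Ruby sketch (no Spark involved) to check that the two forms should produce the same pairs. The sample lines, the names per_line, per_partition, via_map and via_partitions, and the use of each_slice to stand in for partitions are all assumptions made for illustration; on a real RDD you would call map and map_partitions as shown above.

```ruby
# Plain-Ruby stand-ins only; ruby-spark is not loaded here.

# Option 1: per-element function with the logic inlined.
per_line = lambda { |line| [line + " more stuff", 1] }

# Option 2: per-partition function that defines its helper inside the lambda.
per_partition = lambda { |part|
  def manipulate_data(line)
    line + " more stuff"
  end

  part.map { |line| [manipulate_data(line), 1] }
}

# Hypothetical sample data, split into chunks that play the role of partitions.
lines = ["key=a", "key=b", "key=c"]
partitions = lines.each_slice(2).to_a

via_map = lines.map(&per_line)
via_partitions = partitions.flat_map { |part| per_partition.call(part) }

puts via_map == via_partitions # prints true
```

The point of the second form is that the helper lives inside the function body itself, so the function does not depend on anything defined outside it.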