paulgoetze / weka-jruby

Machine Learning & Data Mining with JRuby
MIT License

add new convenient methods #17

Closed kcning closed 7 years ago

kcning commented 7 years ago
  1. The object returned by Instances.from_arff should preferably be a Weka::Core::Instances rather than a Java::WekaCore::Instances. The Java class doesn't have the convenience methods.

  2. We should be able to retrieve the attribute values directly from an Instance, not just the internal floating-point values.
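
To illustrate the distinction in point 2: Weka stores a nominal value internally as a floating-point index into the attribute's list of labels. The sketch below maps such indices back to readable values in plain Ruby. The `NominalAttribute` and `NumericAttribute` structs, the `label_for` method, and the sample data are hypothetical stand-ins for illustration, not part of the weka-jruby API.

```ruby
# Hypothetical stand-ins for Weka attributes (not the weka-jruby API).
NominalAttribute = Struct.new(:name, :labels) do
  # A nominal value is stored as the index of its label.
  def label_for(internal_value)
    labels.fetch(internal_value.to_i)
  end
end

NumericAttribute = Struct.new(:name) do
  # A numeric value is stored as-is.
  def label_for(internal_value)
    internal_value
  end
end

attributes = [
  NominalAttribute.new('outlook', %w[sunny overcast rainy]),
  NumericAttribute.new('temperature'),
  NominalAttribute.new('play', %w[yes no])
]

# Made-up internal representation of one instance.
internal_values = [0.0, 85.0, 1.0]

# Pair each attribute with its stored value and translate it.
readable_values = attributes.zip(internal_values).map do |attribute, value|
  attribute.label_for(value)
end
# readable_values => ["sunny", 85.0, "no"]
```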

paulgoetze commented 7 years ago

Regarding 1.:

When loading a dataset with Instances.from_arff, you actually get a Ruby object with access to all the convenience methods:

dataset = Weka::Core::Instances.from_arff('weather.arff')
# => #<Java::WekaCore::Instances:0x4fd05028>
dataset.attributes # which is not a method from the Java class
# => [#<Java::WekaCore::Attribute:0x2c88a3e8>, ...]

Although it says Java::WekaCore::..., it is actually what you need. I guess that's because the Ruby classes are imported Java classes that have been reopened.
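
A plain-Ruby sketch of that guess: when a class is made available under a second constant and then reopened, the added methods work, but the class keeps reporting its original name. `JavaStandIn` and `WekaSketch` are hypothetical module names used only for illustration; this is not the gem's actual code and it does not involve a real Java proxy.

```ruby
# Stand-in for the class that originally lives in the Java namespace.
module JavaStandIn
  class Instances
    def initialize(rows)
      @rows = rows
    end

    def num_instances
      @rows.size
    end
  end
end

module WekaSketch
  # Make the existing class available under a second constant.
  Instances = JavaStandIn::Instances

  # Because the constant already points to a class, this reopens it
  # instead of defining a new one.
  class Instances
    def instances_count
      num_instances
    end
  end
end

dataset = WekaSketch::Instances.new([[1.0], [2.0]])
dataset.instances_count # the added convenience method is available
dataset.class.name      # still "JavaStandIn::Instances"
```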

Regarding 2.:

Do you want to access attribute values of an Instance or all values (as a matrix/2d array) of Instances?

For an Instance there is the DenseInstance#values method. Thus, for an Instances object you could use dataset.instances.map(&:values).

So, if you were thinking about something like an Instances#to_matrix method, that might indeed be handy.

paulgoetze commented 7 years ago

@kcning I added an Instances#to_m method, which returns a Ruby Matrix of the instances' values.
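
For readers unfamiliar with Ruby's Matrix class, here is a minimal sketch of what working with such a result could look like. It uses only the standard `matrix` library; the `rows` array is made-up data standing in for `dataset.instances.map(&:values)`, and the way the gem builds its matrix internally is not shown in this thread, so `Matrix.rows` is an assumption.

```ruby
require 'matrix'

# Made-up values standing in for dataset.instances.map(&:values).
rows = [
  [85.0, 85.0],
  [80.0, 90.0],
  [83.0, 86.0]
]

# Build a matrix with one row per instance.
matrix = Matrix.rows(rows)

matrix.row_count      # number of instances: 3
matrix.column_count   # number of attributes: 2
matrix[1, 1]          # second attribute of the second instance: 90.0
first_attribute = matrix.column(0).to_a # all values of the first attribute
```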

This will be available in the next release (v0.5.0), so I'll close this issue. Feel free to comment or reopen if you feel something is missing from the resolution.