paulgoetze / weka-jruby

Machine Learning & Data Mining with JRuby
MIT License

Weka::UnassignedTrainingInstancesError #25

Closed samuraraujo closed 6 years ago

samuraraujo commented 6 years ago

Hi all, I have a problem deserializing a model.

Any idea what the solution could be?

@classifier = Weka::Core::SerializationHelper.deserialize('./randomforest.model') if File.exist?('./randomforest.model')
instances = Weka::Core::Instances.from_csv('./datapoint.txt')
instances.class_attribute = :gain
@classifier.classify values

Weka::UnassignedTrainingInstancesError: Classifier is not trained with Instances. You can set the training instances with #train_with_instances.
  ensure_trained_with_instances! at /home/samur/.rvm/gems/jruby-9.0.5.0/gems/weka-0.5.0-java/lib/weka/classifiers/utils.rb:34
  classify at /home/samur/.rvm/gems/jruby-9.0.5.0/gems/weka-0.5.0-java/lib/weka/classifiers/utils.rb:104
  predict at classifier.rb:118
  at classifier.rb:178

paulgoetze commented 6 years ago

Hey @samuraraujo, this is the issue described in #10. Unfortunately, I still haven’t had time to sort it out properly.

If you look at the example code in the zip attachment in this comment, https://github.com/paulgoetze/weka-jruby/issues/10#issuecomment-264727674, you should get an idea of how to work around it for now.

And since this is actually a crucial bug for using the gem in the wild, I'll do my best to provide the overdue fix for it by the end of the month – promised :)

Also let me know if you need any further help!

samuraraujo commented 6 years ago

Hi @paulgoetze, it would be really great if you could fix it.

I did not manage to make it work.

paulgoetze commented 6 years ago

@samuraraujo on it, and making quite some progress :) I will hopefully publish a new release by early next week.

samuraraujo commented 6 years ago

Nice! Thank you a lot. I appreciate it.

paulgoetze commented 6 years ago

@samuraraujo I published a new version of the gem, which also includes the fix for your issue.

Please note that the deserialized classifier needs some instances_structure info in order to run your code as above. This can be provided by a <your-model-filename>.structure file, which is just the serialized training_data.string_free_header, i.e. an Instances object with only the attribute and class-attribute info and no instance items. With the gem version I just published, this file is automatically created and picked up when serializing/deserializing a classifier. So nothing is needed on your side except serializing and deserializing your trained classifier again, and your code should then run without changing anything else.
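
To illustrate the sidecar-file idea only, here is a plain-Ruby sketch that does not use the gem at all. ToyClassifier, save_with_structure, load_with_structure, sidecar_demo, and the feature_a/feature_b attribute names are all made up for this example, and Marshal stands in for whatever serialization the gem really uses, so do not read this as the gem's actual implementation:

```ruby
require 'tmpdir'

# Hypothetical stand-in for a trained classifier; NOT part of the weka gem.
ToyClassifier = Struct.new(:name, :instances_structure)

# Write the model file and, next to it, a "<model>.structure" sidecar file.
def save_with_structure(classifier, structure, path)
  File.binwrite(path, Marshal.dump(classifier.name))
  File.binwrite("#{path}.structure", Marshal.dump(structure))
end

# Read the model back and attach the sidecar structure if the file exists.
def load_with_structure(path)
  classifier = ToyClassifier.new(Marshal.load(File.binread(path)))
  sidecar = "#{path}.structure"
  classifier.instances_structure = Marshal.load(File.binread(sidecar)) if File.exist?(sidecar)
  classifier
end

# Round trip once with the sidecar present and once after deleting it.
def sidecar_demo
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'randomforest.model')
    # Made-up header: attribute names plus the class attribute, no instance rows.
    header = { attributes: %i[feature_a feature_b gain], class_attribute: :gain }
    save_with_structure(ToyClassifier.new('random_forest'), header, path)

    with_sidecar = load_with_structure(path)
    File.delete("#{path}.structure")
    without_sidecar = load_with_structure(path)
    [with_sidecar, without_sidecar]
  end
end
```

In this toy version, a model loaded without its sidecar comes back with instances_structure set to nil, which roughly corresponds to the situation where you have to set the structure by hand.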

If you use a classifier model that you did not serialize yourself, you can add the instances_structure to your deserialized classifier before calling the #classify method, so this should work:

@classifier = Weka::Core::SerializationHelper.deserialize('./randomforest.model') if File.exist?('./randomforest.model')
instances = Weka::Core::Instances.from_csv('./datapoint.txt')
instances.class_attribute = :gain

# this adds the info about the instance structure to your classifier:
@classifier.instances_structure = instances

@classifier.classify values

Please let me know whether this works for you.

samuraraujo commented 6 years ago

Thank you Paul! It worked this time for me. Thank you for the effort on updating it. Best regards.