Open henrebotha opened 8 years ago
There are some known issues with LSI. Are you using GNU GSL or the native Ruby version? The native Ruby version relies on a buggy Ruby implementation of a matrix transform (discussed in #30) that throws this type of error for some inputs. If that's what you're using, switching to GNU GSL will fix this. If you're already using GNU GSL, this will require some digging.
You're fast! I am using the native Ruby version. I'll hit up GNU GSL and see what happens.
If I were you I'd mention this in the Readme.
I happened to be on the issues. Yeah, let me know how GNU GSL works out. I need to rewrite the SVD, but I'm not a great C programmer, so the process has been slow, to say the least. If you're trying to train with small inputs, especially ones that use abbreviations, the matrix transform is highly likely to break in the Ruby-only version.
While I have you, I'm getting this:
GSL::ERROR::EUNIMPL: Ruby/GSL error code 24, svd of MxN matrix, M<N, is not implemented (file svd.c, line 61), the requested feature is not (yet) implemented
from /Users/leaply/.rbenv/versions/2.2.4/lib/ruby/gems/2.2.0/bundler/gems/classifier-reborn-4e3bb14d6388/lib/classifier-reborn/lsi.rb:292:in `SV_decomp'
Hmm, this could be related to https://github.com/SciRuby/rb-gsl/issues/21. I'm investigating.
Which version of GSL did you pull down?
1.16 via homebrew
1.16 might work, let me try to pull down fresh versions later and try locally.
I haven't gotten anywhere with this, can anyone else reproduce this?
@henrebotha can you try with the latest master to see if #77 raises an error on your input?
That's gonna take some doing. I'll try when I have access to a Mac.
@henrebotha have you tried this yet?
I intend to close this if there's no more action in the next few days.
@Ch4s3 @henrebotha I'm seeing the same issue with my data and can reproduce with this script:
require 'classifier-reborn'
lsi = ClassifierReborn::LSI.new
# Without gsl this raises NoMethodError
# /classifier-reborn-2.0.4/lib/classifier-reborn/lsi.rb:143:
# in `block in build_index': undefined method `normalize' for nil:NilClass
# With gsl this raises GSL::ERROR::EUNIMPL
# /classifier-reborn-2.0.4/lib/classifier-reborn/lsi.rb:292:in `SV_decomp':
# Ruby/GSL error code 24, svd of MxN matrix, M<N, is not implemented (file svd.c, line 60),
# the requested feature is not (yet) implemented
lsi.add_item 'England', 'xx'
lsi.add_item 'England & Wales', 'xx'
lsi.add_item 'England And Wales', 'xx'
Using GNU GSL, tried upgrading from 2.2.1 to 2.3 and that didn't fix it.
Related to this TODO in lsi.rb?
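One possible reading of the "M<N" message, assuming the index builds a matrix with one row per distinct term and one column per document (an assumption; nothing in this thread confirms it): the three items above contain fewer distinct terms than documents. A rough sketch with a hypothetical tokenizer and stop-word list, not the library's own:

```ruby
require 'set'

# Hypothetical stand-in for the library's tokenizer: lowercase the text,
# keep alphabetic runs, and drop a tiny illustrative stop-word list.
STOP_WORDS = Set.new(%w[and the of]).freeze

def distinct_terms(docs)
  docs.flat_map { |doc| doc.downcase.scan(/[a-z]+/) }
      .reject { |word| STOP_WORDS.include?(word) }
      .uniq
end

# Shape of the assumed term-by-document matrix: rows = terms, cols = docs.
def svd_shape(docs)
  terms = distinct_terms(docs)
  { rows: terms.size, cols: docs.size, rows_lt_cols: terms.size < docs.size }
end

docs = ['England', 'England & Wales', 'England And Wales']
p svd_shape(docs)
```

Under those assumptions the repro yields 2 terms ("england", "wales") against 3 documents, so the matrix would have fewer rows than columns, which is the case the error says is not implemented.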
Any ideas on this? I'm seeing the Ruby/GSL-derived exception in SV_decomp whenever I try to build an index on more than around 2,000 sentences. I have 4,007 sentences I'd like to index. For those 2,000 the classifier works great for my purpose, so I'm really eager to find a way to get this working properly, if possible...
(To be fair, it probably has nothing to do with how many sentences I have and more to do with some sentence beyond the first 2,000 entering the index and causing a problem like the ones seen in the comments above...)
@mepatterson I'd guess you have some malformed input. Can you throw a begin/rescue around your training and see which doc/line blows it up?
@timcraft I know this sounds stupid, but have you double checked that you're actually using GNU GSL? It may not have loaded correctly.
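A minimal sketch of that begin/rescue idea, assuming the training data is an array of [text, category] pairs and that the errors raised descend from StandardError. `find_failing_docs` is a hypothetical helper, not part of classifier-reborn; it only relies on the index responding to `add_item`, as in the script above:

```ruby
# Hypothetical helper (not part of classifier-reborn): adds documents one
# at a time and records every document whose add_item call raises.
def find_failing_docs(index, docs)
  failures = []
  docs.each_with_index do |(text, category), i|
    begin
      index.add_item(text, category)
    rescue StandardError => e
      # Keep the position and text so the offending input can be inspected.
      failures << { position: i, text: text, error: "#{e.class}: #{e.message}" }
    end
  end
  failures
end

# Usage with the real library (assumes the gem is installed and that
# training_pairs is your own array of [text, category] pairs):
#   require 'classifier-reborn'
#   lsi = ClassifierReborn::LSI.new
#   find_failing_docs(lsi, training_pairs).each { |failure| p failure }
```

This only pinpoints a document if each add_item call triggers an index rebuild, which the script above suggests is the case (the error there comes out of build_index without it being called explicitly). If the index is rebuilt once at the end instead, the failure would surface there.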
Actually, I can confirm @timcraft's repro too. Just those three add_item lines cause the GSL crash every time on my machine, using the very latest gsl and rb-gsl.
Ok, I'll try to dig in this weekend.
@Ch4s3 Yep, it appears to be loaded ok. I added this at the top of the script (matrix code from gsl-2.1.0.2/examples/linalg/SV.rb which uses SV_decomp):
puts "Using GSL/#{GSL::VERSION} RubyGSL/#{GSL::RUBY_GSL_VERSION}"
a = GSL::Matrix[[3, 5, 2], [6, 2, 1], [4, 7, 3]]
u, v, s = a.SV_decomp
p u*GSL::Matrix.diagonal(s)*v.trans
Output is "Using GSL/2.3 RubyGSL/2.1.0.2", and the correct matrix.
Same here.
I have GSL installed but it's not even loaded
@elisaado can you post any details?
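One way to check whether rb-gsl loads at all, independent of classifier-reborn. This is a sketch that assumes the gem is required as 'gsl' and exposes the version constants used earlier in this thread:

```ruby
# Try to load rb-gsl and report the result instead of failing silently.
gsl_loaded = begin
  require 'gsl'
  true
rescue LoadError
  false
end

if gsl_loaded
  puts "GSL loaded: GSL/#{GSL::VERSION} RubyGSL/#{GSL::RUBY_GSL_VERSION}"
else
  puts 'GSL not loaded: require "gsl" raised LoadError'
end
```

Running it with the same Ruby and bundle as the failing script matters, since a gem installed for one Ruby version may not be visible to another.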
I'm on Ruby 2.2.4. I'm trying to use LSI. Nothing works, and the error messages SUCK. I've tried both the last release (i.e. the gem version) and the latest commit from GitHub.
Better yet, if I swap the order of the training data, I get this: