jekyll / classifier-reborn

A general classifier module to allow Bayesian and other types of classifications. A fork of cardmagic/classifier.
https://jekyll.github.io/classifier-reborn/
GNU Lesser General Public License v2.1

Classifier-Reborn crashes when summarizing one-sentence articles #152

Closed tra38 closed 7 years ago

tra38 commented 7 years ago

I have been using classifier-reborn to come up with headlines for computer-generated articles. The problem is that I don't get to decide how many sentences a computer-generated article contains, so I run into this edge case when an article has only one sentence:

require 'classifier-reborn'

computer_generated_article = "Then there's The X Days of Christmas, a songbook with 165 verses, such as 'One hundred and sixty-one oblati in crackpottery Santas', and 'One hundred and twenty-eight acephala ventriloquizing'.\n"

headline = ClassifierReborn::Summarizer.summary(computer_generated_article, 1)

NoMethodError: undefined method `col' for nil:NilClass
    from /Users/tariqali/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/classifier-reborn-2.1.0/lib/classifier-reborn/lsi/content_node.rb:30:in `transposed_search_vector'
    from /Users/tariqali/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/classifier-reborn-2.1.0/lib/classifier-reborn/lsi.rb:190:in `block in proximity_array_for_content'
    from /Users/tariqali/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/classifier-reborn-2.1.0/lib/classifier-reborn/lsi.rb:188:in `collect'
    from /Users/tariqali/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/classifier-reborn-2.1.0/lib/classifier-reborn/lsi.rb:188:in `proximity_array_for_content'
    from /Users/tariqali/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/classifier-reborn-2.1.0/lib/classifier-reborn/lsi.rb:166:in `block in highest_relative_content'
    from /Users/tariqali/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/classifier-reborn-2.1.0/lib/classifier-reborn/lsi.rb:166:in `each_key'
    from /Users/tariqali/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/classifier-reborn-2.1.0/lib/classifier-reborn/lsi.rb:166:in `highest_relative_content'
    from /Users/tariqali/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/classifier-reborn-2.1.0/lib/classifier-reborn/lsi/summarizer.rb:29:in `perform_lsi'
    from /Users/tariqali/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/classifier-reborn-2.1.0/lib/classifier-reborn/lsi/summarizer.rb:10:in `summary'
    from (irb):3
    from /Users/tariqali/.rbenv/versions/2.2.2/bin/irb:11:in `<main>'

Ch4s3 commented 7 years ago

Yeah, the LSI isn't really set up for one-sentence input. We can dig in at some point, but it might make sense to check the input first.
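
For anyone who wants to try the "check the input first" route, here is a rough sketch. The helper names `sentence_count` and `summarize_with_guard` are hypothetical, and the sentence split is a crude approximation that may not match how the gem itself splits text. The summarizer call is passed in as a block so the guard stays independent of the gem.

```ruby
# Hypothetical helpers, not part of classifier-reborn.

# Rough sentence count: split on runs of ., ! or ? and ignore blank pieces.
# This is an approximation and may differ from the gem's own splitting.
def sentence_count(text)
  text.to_s.split(/[.!?]+/).map(&:strip).reject(&:empty?).size
end

# Only yield to the summarizer when there are enough sentences.
# For shorter input, fall back to the stripped text itself.
def summarize_with_guard(text, count, min_sentences: 2)
  return text.to_s.strip if sentence_count(text) < min_sentences

  yield(text, count)
end
```

With the gem loaded, the call would look like `summarize_with_guard(article, 1) { |t, c| ClassifierReborn::Summarizer.summary(t, c) }`. The threshold of two sentences is a guess based on the report above, not a documented limit.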

Ch4s3 commented 7 years ago

@tra38 I'm closing due to inactivity. Feel free to ping me if there's work to be done here.

tra38 commented 6 years ago

Hi @Ch4s3, I chose to handle the issue on my end by just catching the error and generating an empty summary instead.
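
A minimal sketch of that workaround, in case it helps someone else. The helper name `summary_or_empty` is made up for illustration, and the summarizer call is passed in as a block; it rescues only the `NoMethodError` shown in the trace above, so other failures still surface.

```ruby
# Hypothetical wrapper, not part of classifier-reborn.
# Runs the given summarizer block and returns an empty string
# if it raises NoMethodError (the error seen with one-sentence input).
def summary_or_empty(text, count)
  yield(text, count)
rescue NoMethodError
  ''
end
```

With the gem loaded, usage would be along the lines of `headline = summary_or_empty(computer_generated_article, 1) { |t, c| ClassifierReborn::Summarizer.summary(t, c) }`.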