Closed mthmulders closed 6 years ago
There are a few possibilities here. There could be a regression in 2.2.0 that we missed, your input could be invalid and Jekyll is swallowing the error, or you could be hitting some odd edge case in the LSI.
Are you using the GSL lib? What does your input look like? Do you have any super short posts, like maybe only a line or two? Does Classifier Reborn 2.1.0 work as expected?
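A quick way to check for very short posts, as a rough sketch: `short_posts` is a hypothetical helper name, and the 20-word threshold and the `*.md` pattern are arbitrary assumptions. The count includes any front matter, so treat the result as a hint only.

```ruby
# Hypothetical helper: list Markdown files under `dir` whose text has
# fewer than `min_words` whitespace-separated words.
# The default threshold of 20 is an arbitrary assumption.
def short_posts(dir, min_words: 20)
  Dir.glob(File.join(dir, "*.md")).sort.select do |path|
    File.read(path).split.size < min_words
  end
end
```

Calling `short_posts("_posts")` from the site root would return the paths of posts that might be worth inspecting by hand.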
Thanks for the suggestions!
Would there be a way to manually invoke Classifier Reborn on my set of Markdown documents to see whether an error occurs?
Yes, you can do it from irb. You'll need to require 'classifier-reborn' first.
Then you can set up a new lsi classifier:
require 'classifier-reborn'
lsi = ClassifierReborn::LSI.new
Then read in each of your Markdown files using File.open. After that, feed the Markdown into the classifier as shown in the docs. Something like this:
strings = [["This text deals with dogs. Dogs.", :dog],
["This text involves dogs too. Dogs!", :dog],
["This text revolves around cats. Cats.", :cat],
["This text also involves cats. Cats!", :cat],
["This text involves birds. Birds.", :bird]]
strings.each { |x| lsi.add_item x.first, x.last }
You'll be trying to use the find_related method. If it blows up, post the error here. If it works, then we should get in touch with our friends over at Jekyll.
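The file-reading step described above could be sketched roughly as follows. This assumes the posts sit in one directory as `*.md` files and may start with a YAML front matter block delimited by `---` lines. `load_posts` and `index_posts` are hypothetical helper names, not part of Classifier Reborn, and stripping the front matter is optional.

```ruby
# Matches a leading YAML front matter block, if present (assumed layout:
# a line of "---", optional content, then another line of "---").
FRONT_MATTER = /\A---\s*\n(.*?\n)?---\s*\n/m

# Hypothetical helper: read every Markdown file under `dir` and return
# a hash of { path => body } with any leading front matter removed.
def load_posts(dir)
  Dir.glob(File.join(dir, "*.md")).sort.each_with_object({}) do |path, posts|
    posts[path] = File.read(path).sub(FRONT_MATTER, "")
  end
end

# Hypothetical helper: feed each post body into any object that responds
# to add_item, such as the lsi classifier created above.
def index_posts(lsi, posts)
  posts.each_value { |body| lsi.add_item(body) }
  lsi
end
```

With the gem loaded, `index_posts(lsi, load_posts("_posts"))` would fill the classifier, after which find_related can be tried against it.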
Thanks again for the suggestions!
I did some experiments with ClassifierReborn::LSI.new, add_item and find_related, while reading raw Markdown files. Since I don't have categories, I skipped the second argument to add_item. It seems to work pretty well, giving me documents that are related to the text I was looking for. So maybe it is indeed something in the Jekyll / Classifier Reborn integration?
For reference, here is the script that I experimented with:
#!/usr/bin/env ruby
require 'classifier-reborn'
lsi = ClassifierReborn::LSI.new
paths = [
  "_posts/2013-03-11-ipv6-on-raspbian.md",
  "_posts/2017-02-25-blah-blah-microservices-blah-blah.md",
  "_posts/2017-06-22-jbcnconf-and-voxxedlu.md",
  "_posts/2017-12-30-getting-started-with-zuul.md"
]
paths.each do |path|
  puts "Reading file " << path
  File.open(path) do |file|
    post = ""
    file.each do |line|
      post << line
    end
    lsi.add_item(post)
  end
end
puts "Finding related stuff"
related = lsi.find_related("In these days of microservices", 1)
puts "Related text:"
puts related
Interesting, I was expecting this to reveal an issue. I guess we need to see what's going on over at Jekyll. @parkr @jekyll/administrators I'll file an issue and see if we can get to the bottom of this.
I don’t believe site.related_post is a thing, so that makes sense. Please try site.related_posts instead.
Well, now I feel stupid...!
Indeed, site.related_posts contains some related documents. The strange thing, though, is that I can see them in the layout for my post, but not in a separate file which populates a sidebar next to the post content. I'll dive into that. Thanks a lot!
Glad we got this figured out.
Not sure if this is the right place to ask... Feel free to redirect my question if it doesn't belong here.
I'm using the following:
I'm trying to use LSI to build "related posts", so that I can enrich a document in Jekyll with references to related documents.
However, the site.related_posts variable in Jekyll is always nil. I investigated this by adding the following snippet in my Liquid template:
And this snippet always renders to
I don't know how to troubleshoot this any further. Any clues / suggestions?