aurelian / ruby-stemmer

Expose libstemmer_c to Ruby
http://locknet.ro/archive/2009-10-29-ann-ruby-stemmer.html
MIT License
251 stars 22 forks source link
c ruby ruby-extension rubynpl stemmer

= Notice @aurelian May 2022

👋 This project started in 2008 mostly as a mean for me to learn how to build C extensions to ruby, exposing a library at that time I needed to use in a real life project. It's 2022 and many things changed since. Most important is my lack of time to keep up with recent libstemmer_c versions and releasing builds compatible with various versions of Windows.

With this in mind, it is fair to archive this project.

= Ruby Stemmer

Ruby-Stemmer exposes SnowBall API to Ruby.

{Travis CI Status}[https://api.travis-ci.org/aurelian/ruby-stemmer.png]

This package includes libstemmer_c library released under BSD licence and available for free {here}[https://snowballstem.org/download.html].

Support for latin language is also included and it has been generated with the snowball compiler using {schinke contribution}[https://snowballstem.org/otherapps/schinke/].

For more details about libstemmer_c please visit the {SnowBall website}[https://snowballstem.org/].

== Usage

require 'rubygems' require 'lingua/stemmer'

stemmer= Lingua::Stemmer.new(:language => "ro") stemmer.stem("netăgăduit") #=> netăgădu

=== Alternative

require 'rubygems' require 'lingua/stemmer'

Lingua.stemmer( %w(incontestabil neîndoielnic), :language => "ro" ) #=> ["incontest", "neîndoieln"] Lingua.stemmer("installation") #=> "instal" Lingua.stemmer("installation", :language => "fr", :encoding => "ISO_8859_1") do | word | puts "~> #{word}" #=> "instal" end # => #

=== Gemfile

gem 'ruby-stemmer', '>=2.0.0', :require => 'lingua/stemmer'

=== More details

== Install

gem install ruby-stemmer

==== Windows

There's also a Windows (Fat bin)

gem install ruby-stemmer --platform=x86-mingw32

As far as I know the above should work with {rubyinstaller}[http://rubyinstaller.org/]. If it fails, you could try with:

gem install ruby-stemmer --platform=x86-mswin32

{It's known}[https://cl.ly/BX9o] to work under Windows XP.

=== Development version

$ git clone git://github.com/aurelian/ruby-stemmer.git $ cd ruby-stemmer $ rake -T #<== see what we've got $ rake compile #<== builds the extension do'h $ rake test

==== Cross Compiling

Install {rake-compiler-dock}[https://github.com/rake-compiler/rake-compiler-dock] and follow the setup.

Then, inside the docker image:

$ AR=i686-w64-mingw32-ar CC=i686-w64-mingw32-gcc LD=i686-w64-mingw32-ld rake cross native gem

Or, build the lib first then compile:

$ cd libstemmer_c $ AR=i686-w64-mingw33-ar CC=i686-w64-mingw32-gcc LD=i686-w64-mingw32-ld make $ cd ../ $ rake cross native gem

== NOT A BUG

The stemming process is an algorithm to allow one to find the stem of an word (not the root of it). For further reference on stem vs. root, please check wikipedia articles on the topic:

== TODO

== Note on Patches/Pull Requests

== Alternative Stemmers for Ruby

== Copyright

Copyright (c) 2008-2020 {Aurelian Oancea}[http://locknet.ro]. See MIT-LICENSE for details.

== Contributors

encoding: utf-8