cadmiumcr / stemmer

MIT License
1 stars 1 forks source link

Missing String monkeypatch #2

Open watzon opened 5 years ago

watzon commented 5 years ago

Looks like the core_ext for String didn't get ported over to this

rmarronnier commented 5 years ago

Exactly. And yet the specs pass fine. Including :

  it "should tokenize and stem with String#tokenize_and_stem" do
    "scoring stinks".tokenize_and_stem.should eq(["score", "stink"])
    "SCORING STINKS".tokenize_and_stem.should eq(["score", "stink"])
  end

That's because the core_ext for String is already included in the tokenizer shard, on which the stemmer depends.

Is there a point to port the same code in all the shards ? :-)

watzon commented 5 years ago

Ahh ok. Tbh I think for things like that we should probably have them included directly in the shard that depends on them. I don't mind having each shard have it's own core_ext because it makes things easier to find.

rmarronnier commented 5 years ago

I totally understand your concern about finding easily the code.

So why not put the core_ext in cadmium_util (because one way or another, all cadmium shards will depend on it) and then point to its location each time we declare a new extension using it, like this :


# Extends cadmium_util/core_ext

module StringExtension
    def stem(stemmer = Cadmium::PorterStemmer)
      stemmer.stem(self)
    end

    def tokenize_and_stem(stemmer = Cadmium::PorterStemmer, keep_stops = false)
      stemmer.tokenize_and_stem(self, keep_stops)
    end
  end

What do you think ?