pharo-ai / NgramModel

Ngram language model implemented in Pharo
MIT License
4 stars 4 forks source link

#ngramCounts method should be added to NgramModel #1

Closed myroslavarm closed 4 years ago

myroslavarm commented 4 years ago

I think it's nice that we have everything wrapped inside a single ngram object but some missing functionality makes it difficult to get to some of the properties from the outside, such as accessing the ngram count info.

olekscode commented 4 years ago

Hello @myroslavarm, Could you provide a use case scenario in which you need to access the table of n-gram counts? Because this table is hidden by design: it is an implementation detail of the NgramModel which should not be accessible through public API

myroslavarm commented 4 years ago

The use case i had in mind was to be able to access the table in order to cut off some of the results (for example, if you want to leave only the results with frequencies above a certain threshold). If this is by design and you don't want people to modify the tables for their own use, then maybe there should be some implementation from within the ngram model that would handle cases like this.

olekscode commented 4 years ago

OK. You are right! But the private properties should not be accessible from outside. The way I see it (I might be wrong), it is the responsibility of the NgramModel to perform the filtering after (or as part of) the training.

So you have two options:

  1. Create a subclass of NgramModel and add the filtering functionality to this class
  2. Update the current implementation of NgramModel and make a PR.

I think you should do the latter because indeed the filtering option could be very useful for many applications of the NgramModel.