whoosh-community / whoosh

Whoosh is a fast, featureful full-text indexing and searching library implemented in pure Python.

Language Model through Whoosh in Information Retrieval #474

Open fortable1999 opened 6 years ago

fortable1999 commented 6 years ago

Original report by Anonymous.


I am working in information retrieval.

Can anyone guide me on how to implement a language model in Whoosh? I have already applied TF-IDF and BM25. I am new to IR, and any help would be appreciated.

For example:

The simplest form of language model throws away all conditioning context and estimates each term independently. Such a model is called a unigram language model:

P_{uni}(t_1t_2t_3t_4) = P(t_1)P(t_2)P(t_3)P(t_4)
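
To make the unigram formula concrete, here is a minimal sketch in plain Python. It does not use any Whoosh API: the function names `unigram_model` and `unigram_prob` are hypothetical, the tokenizer is a plain whitespace split, and the probabilities are unsmoothed maximum-likelihood estimates, so any unseen term drives the product to zero.

```python
from collections import Counter
from math import prod  # available in Python 3.8+


def unigram_model(tokens):
    """Estimate P(t) for each term as count(t) / total tokens (no smoothing)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def unigram_prob(model, query_tokens):
    """P_uni(t_1 ... t_n) = P(t_1) * ... * P(t_n); unseen terms contribute 0."""
    return prod(model.get(term, 0.0) for term in query_tokens)


# Hypothetical toy document, tokenized by whitespace.
doc_tokens = "the cat sat on the mat".split()
model = unigram_model(doc_tokens)
p_query = unigram_prob(model, ["the", "cat"])  # (2/6) * (1/6)
```

In a retrieval setting, one such model would typically be built per document and documents ranked by the probability they assign to the query, usually with some form of smoothing added.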

There are many more complex kinds of language models, such as bigram language models, which condition on the previous term:

P_{bi}(t_1t_2t_3t_4) = P(t_1)P(t_2\vert t_1)P(t_3\vert t_2)P(t_4\vert t_3)
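
The bigram formula can be sketched the same way in plain Python. Again, this is not a Whoosh API: `bigram_counts` and `bigram_prob` are hypothetical names, the estimates are unsmoothed, and P(t_i | t_{i-1}) is approximated as count(t_{i-1}, t_i) / count(t_{i-1}).

```python
from collections import Counter


def bigram_counts(tokens):
    """Return (unigram counts, bigram counts) for a token sequence."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_prob(unigrams, bigrams, query_tokens):
    """P_bi(t_1 ... t_n) = P(t_1) * product of P(t_i | t_{i-1}), unsmoothed."""
    if not query_tokens:
        return 1.0
    total = sum(unigrams.values())
    # The first term has no context, so fall back to its unigram estimate.
    prob = unigrams[query_tokens[0]] / total
    for prev, cur in zip(query_tokens, query_tokens[1:]):
        if unigrams[prev] == 0:
            return 0.0  # unseen context: the estimate is undefined, treat as 0
        # Approximation: count(prev) also counts a final token with no successor.
        prob *= bigrams[(prev, cur)] / unigrams[prev]
    return prob


# Hypothetical toy document, tokenized by whitespace.
doc_tokens = "the cat sat on the mat".split()
unigrams, bigrams = bigram_counts(doc_tokens)
p_query = bigram_prob(unigrams, bigrams, ["the", "cat", "sat"])  # (2/6) * (1/2) * (1/1)
```

Because bigram counts are much sparser than unigram counts, a practical version would need smoothing or interpolation with the unigram model.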

stevennic commented 4 years ago

I answered this question on Stack Overflow, where it was also posted.