kbalog / uis-dat640-fall2020

Information Retrieval and Text Mining course at the University of Stavanger (DAT640), 2020 fall

Query_processing_DaaT score function #2

Open ChristofferHolmesland opened 4 years ago

ChristofferHolmesland commented 4 years ago

What is the correct method for calculating the score in the Query_processing_DaaT exercise? Using the score function from the notebook on document 0 I get score = 1/3. The test expects 0.0526.


Doc0 = "duck duck duck"  
Query = "beijing duck recipe"  
score = (c_t,d / |d|) * (c_t,q / |q|), summed over the query terms t  
  (0 / 3) * (1 / 3)  # beijing in doc0  
+ (3 / 3) * (1 / 3)  # duck in doc0  
+ (0 / 3) * (1 / 3)  # recipe in doc0  
= 1/3

BerntA commented 4 years ago

The solution uses len(query), which is the length of the string in characters (= 19), but it should be len(query.split()), the number of terms (= 3), right? If you divide by len(query), you get 1/19 = 0.0526.
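
To make the difference concrete, here is a minimal sketch of both variants. It is not the notebook's actual code: the function name `score` and the `use_char_length` flag are hypothetical, and it assumes simple whitespace tokenization and the scoring formula quoted above.

```python
from collections import Counter


def score(doc: str, query: str, use_char_length: bool = False) -> float:
    """Sum over query terms of (c_t,d / |d|) * (c_t,q / |q|).

    With use_char_length=True, |q| is taken as len(query), the number of
    characters in the string, which is the suspected bug. Otherwise |q| is
    the number of whitespace-separated terms.
    """
    doc_terms = doc.split()
    query_terms = query.split()
    doc_counts = Counter(doc_terms)      # c_t,d for every term in the document
    query_counts = Counter(query_terms)  # c_t,q for every term in the query

    doc_len = len(doc_terms)
    query_len = len(query) if use_char_length else len(query_terms)

    # Counter returns 0 for terms that do not occur in the document.
    return sum(
        (doc_counts[t] / doc_len) * (query_counts[t] / query_len)
        for t in query_counts
    )


doc0 = "duck duck duck"
query = "beijing duck recipe"

print(score(doc0, query))                        # |q| = 3 terms
print(score(doc0, query, use_char_length=True))  # |q| = 19 characters
```

With the term count the score for Doc0 is 1/3; with the character count it is 1/19, which rounds to the 0.0526 the test expects.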

FebriantiW commented 4 years ago

But why would it be divided by len(query) and not by the number of terms? I don't understand, since relevance is counted by terms and not by characters.

BerntA commented 4 years ago

I think it is a bug: the score should be divided by the number of terms. Hopefully the TA or professor can confirm this!