"This is the biggest new requirement. Your main program must operate in two modes: Boolean query mode and ranked query mode.
In ranked query mode, you must process a query without any Boolean operators and return the top K = 10 documents satisfying the query. Use the 'term at a time' algorithm as discussed in class:
For each term t in the query:
  (a) Calculate $w_{q,t} = \ln(1 + N / \mathrm{df}_t)$.
  (b) For each document d in t's postings list:
    i. Acquire an accumulator value $A_d$ (the design of this system is up to you).
    ii. Calculate $w_{d,t} = 1 + \ln(\mathrm{tf}_{t,d})$.
    iii. Increase $A_d$ by $w_{d,t} \times w_{q,t}$.
For each non-zero $A_d$, divide $A_d$ by $L_d$, where $L_d$ is read from the docWeights.bin file.
Select and return the top K = 10 documents by largest $A_d$ value. (Use a binary heap priority queue to select the largest results; do not sort the accumulators.)
Use 8-byte floating point numbers for all the calculations.
(print ranked retrieval results: Please print the title of each document returned from a ranked retrieval, as well as the final accumulator value for that document.)"
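The term-at-a-time procedure quoted above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the required design: the function name `ranked_query`, the in-memory `postings` dictionary of `(doc_id, tf)` pairs, and the `doc_weights` sequence (standing in for values of L_d already read from docWeights.bin, whose on-disk layout the text does not specify) are all hypothetical. Python's `float` is an 8-byte IEEE 754 double, which matches the precision requirement.

```python
import heapq
import math
from collections import defaultdict


def ranked_query(query_terms, postings, doc_weights, num_docs, k=10):
    """Score documents term-at-a-time and return the top k as (doc_id, score).

    postings    -- hypothetical index: term -> list of (doc_id, tf) pairs
    doc_weights -- hypothetical sequence: doc_weights[doc_id] is L_d
    num_docs    -- N, the number of documents in the corpus
    """
    accumulators = defaultdict(float)  # A_d, created on first use

    for term in query_terms:
        plist = postings.get(term)
        if not plist:
            continue  # a term absent from the index contributes nothing
        # df_t is the length of the postings list for t.
        w_qt = math.log(1.0 + num_docs / len(plist))
        for doc_id, tf in plist:
            w_dt = 1.0 + math.log(tf)
            accumulators[doc_id] += w_dt * w_qt

    # Keep only the k best scores in a min-heap instead of sorting
    # every accumulator; heap[0] is the weakest of the current top k.
    heap = []
    for doc_id, a_d in accumulators.items():
        if a_d == 0.0:
            continue
        score = a_d / doc_weights[doc_id]  # normalize by L_d
        if len(heap) < k:
            heapq.heappush(heap, (score, doc_id))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, doc_id))

    # Popping yields ascending scores; reverse for best-first order.
    best = [heapq.heappop(heap) for _ in range(len(heap))]
    best.reverse()
    return [(doc_id, score) for score, doc_id in best]
```

A caller would then look up the title for each returned `doc_id` and print it next to the score, as the printing requirement asks. Bounding the heap at K entries is one way to honor "do not sort the accumulators"; pushing every accumulator onto a max-heap and popping K times would be an equally valid reading.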