punit-haria closed 10 years ago
I'll merge this for now, but can you make the changes that I suggested and send another pull request?
For the next step, can you compare the global means with the skipped means? The global sums and lengths currently include the skipped documents, so we need to remove that contribution before the two can be compared fairly. We can do that using just the final sums and lengths, for example for rank 2:
    sum = global_sum_rank_2 - skipped_sum_rank_2
    length = global_length_rank_2 - skipped_length_rank_2
    mean_rank_2 = sum / length
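Something along these lines might work as a helper, as a rough sketch only. The function name and its arguments are hypothetical (they are not in the repo), and it assumes the sums and lengths for each rank are already available as plain numbers:

```python
def mean_excluding_skipped(global_sum, global_length, skipped_sum, skipped_length):
    """Mean relevance at one rank over documents that were NOT skipped.

    Hypothetical helper: all four arguments are assumed to be the final
    per-rank totals produced by the existing scripts.
    """
    remaining_length = global_length - skipped_length
    if remaining_length <= 0:
        # Every document at this rank was skipped, so there is no mean to report.
        return None
    # float() avoids integer division when the inputs are ints under Python 2.
    return float(global_sum - skipped_sum) / remaining_length


# Made-up numbers for rank 2, purely for illustration.
mean_rank_2 = mean_excluding_skipped(10.0, 5, 4.0, 2)
```

Returning None when nothing is left at a rank is just one choice; skipping that rank in the plot would work as well.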
skipped_relevance_mean.py --> gets the relevance means of documents skipped one or more times (for each position between 1 and 10)
global_url_distribution.py --> plots the number of URLs repeated once, twice, etc. (globally)
session_parser.py --> added functions that return skipped documents
Sorry about the reformatting, my editor was being really picky.