ykdojo personalized_search_challenge issues

An attempt at the Kaggle competition Personalized Web Search Challenge, hosted by Yandex (http://www.kaggle.com/c/yandex-personalized-web-search-challenge)

12 stars, 4 forks

issues

Index files

#55 michaeldubyu opened 10 years ago
0
S2 goal2

#54 punit-haria closed 10 years ago
0
Started working on look_at_multiple_sessions. So far, it's just pseudo-code

#53 ykdojo closed 5 years ago
0
Sending this as a pull request so comments can be made on this

#52 ykdojo closed 10 years ago
3
Fixed a bug in sample_script: the script did not include the last 1% of the users.

#51 ykdojo closed 5 years ago
0
Task C3: Locally test the single-skipped method

#50 ykdojo opened 10 years ago
0
S1 Task A0: Create a parser that parses users instead of sessions.

#49 ykdojo closed 10 years ago
4
Times skipped

#48 punit-haria closed 10 years ago
0
Calculate the global means and skipped means using 2^rel instead of plain rel.

#47 ykdojo opened 10 years ago
0
Task C2: Using C1, reorder ranking -> do this and submit the new result

#46 ykdojo opened 10 years ago
0
Have global_relevance_mean.py and skipped_relevance_mean.py write their results in CSV files

#45 ykdojo closed 10 years ago
0
Fixes on strategy2_prediction, including one major bug

#44 ykdojo closed 10 years ago
0
Ignore .pyc files as they are just Python binaries

#43 ykdojo closed 10 years ago
0
Added comments in CSV files

#42 ykdojo closed 10 years ago
0
Strategy2 prediction

#41 punit-haria closed 10 years ago
0
Make a Google doc to list and explain related papers in summary

#40 ykdojo opened 10 years ago
3
Script to evaluate, split train=24, test=3, evaluation metric

#39 ykdojo opened 10 years ago
1
Skipped relevance mean bug fix

#38 punit-haria closed 10 years ago
0
Task C1 and C2: For each test query that’s applicable for Strategy 2, predict the relevance rate for each document

#37 ykdojo closed 10 years ago
0
Compares the non-skipped case and skipped case.

#36 ykdojo closed 10 years ago
0
Unit test lib/dcg.py on the branch locally_test_2

#35 ykdojo opened 10 years ago
1
comparing global and skipped means

#34 ykdojo closed 10 years ago
0
Parser version 2

#33 shlfung opened 10 years ago
1
Do a t-test with the 9th rank and the 10th rank for the skipped cases

#32 ykdojo closed 10 years ago
0
Reorganized the file structure. We still need to fix paths in most scripts. Someone, please pull this.

#31 ykdojo closed 10 years ago
0
Change the seed to random (don't set a seed)

#30 ykdojo closed 10 years ago
0
Look into LambdaMART and LambdaRank

#29 ykdojo opened 10 years ago
0
S1 Task A1: Find the relevance mean for documents in each rank position, in the case the user has liked the same query-document pair in previous sessions.

#28 ykdojo opened 10 years ago
0
Organize the file structure

#27 ykdojo closed 10 years ago
0
Added script that gets the relevance_means of skipped documents. Added script for global url distributions.

#26 punit-haria closed 10 years ago
1
Index files

#25 michaeldubyu closed 10 years ago
0
Benchmark performance for lookups using index files

#24 lchsiao opened 10 years ago
6
Index files

#23 lchsiao closed 10 years ago
2
Create index files for train dat for faster access

#22 lchsiao closed 10 years ago
0
Found global relevance mean for each rank position. Plotted them as well.

#21 ykdojo closed 10 years ago
0
Script to create 1% sample users

#20 lchsiao closed 10 years ago
1
This is a test pull request, don't pull

#19 lchsiao closed 10 years ago
1
What happens when a user clicks "next"?

#18 ykdojo opened 10 years ago
0
Make a team member list with descriptions about their skills, availability, and so on

#17 ykdojo closed 10 years ago
1
Clean the current scripts we have

#16 ykdojo closed 10 years ago
0
Task B: Randomly sample users (for instance, 1% of all users) from the original train file and produce a smaller train file. This will be used for quick experiments.

#15 ykdojo closed 10 years ago
0
Task A.3: Compare A.2 and A.1. Is there a significant difference?

#14 ykdojo closed 10 years ago
0
Task A.2: Find the relevance mean for documents in each rank position, in the case in which the user has SKIPPED the same document pair in the same session

#13 ykdojo closed 10 years ago
0
Task A.1: Using the parser, find the global relevance mean for documents in each search rank (position)

#12 ykdojo closed 10 years ago
0
Make a template for comments

#11 ykdojo closed 10 years ago
0
Task A.0: Write unittests for the object-oriented parser

#10 ykdojo opened 10 years ago
0
Cleaned parser.py and created user_parser.py. NOTE: the parser still does not read the last line

#9 ykdojo closed 10 years ago
0
parser.py does not parse the last line

#8 ykdojo closed 10 years ago
0
Benchmarked parser.py. It took about 50 minutes to parse the whole train file (on my SSD 8GB MacBook Pro). There are about 34.5 million sessions in total.

#7 ykdojo closed 10 years ago
2
Added a Python parser that was written by a Kaggler.

#6 ykdojo closed 10 years ago
0