ykdojo/personalized_search_challenge
An attempt at the Kaggle competition Personalized Web Search Challenge, hosted by Yandex (http://www.kaggle.com/c/yandex-personalized-web-search-challenge)
12 stars, 4 forks
issues
Index files
#55
michaeldubyu
opened
10 years ago
0
S2 goal2
#54
punit-haria
closed
10 years ago
0
Started working on look_at_multiple_sessions. So far, it's just pseudo-code
#53
ykdojo
closed
5 years ago
0
Sending this as a pull request so comments can be made on this
#52
ykdojo
closed
10 years ago
3
Fixed a bug in sample_script: the script did not include the last 1% of the users.
#51
ykdojo
closed
5 years ago
0
Task C3: Locally test the single-skipped method
#50
ykdojo
opened
10 years ago
0
S1 Task A0: Create a parser that parses users instead of sessions.
#49
ykdojo
closed
10 years ago
4
Times skipped
#48
punit-haria
closed
10 years ago
0
Calculate the global means and skipped means using 2^rel instead of plain rel.
#47
ykdojo
opened
10 years ago
0
Task C2: Using C1, reorder ranking -> do this and submit the new result
#46
ykdojo
opened
10 years ago
0
Have global_relevance_mean.py and skipped_relevance_mean.py write their results in CSV files
#45
ykdojo
closed
10 years ago
0
Fixes on strategy2_prediction, including one major bug
#44
ykdojo
closed
10 years ago
0
Ignore .pyc files as they are just Python binaries
#43
ykdojo
closed
10 years ago
0
Added comments in CSV files
#42
ykdojo
closed
10 years ago
0
Strategy2 prediction
#41
punit-haria
closed
10 years ago
0
Make a Google doc to list and explain related papers in summary
#40
ykdojo
opened
10 years ago
3
Script to evaluate, split train=24, test=3, evaluation metric
#39
ykdojo
opened
10 years ago
1
Skipped relevance mean bug fix
#38
punit-haria
closed
10 years ago
0
Task C1 and C2: For each test query that’s applicable for Strategy 2, predict the relevance rate for each document
#37
ykdojo
closed
10 years ago
0
Compares the non-skipped case and skipped case.
#36
ykdojo
closed
10 years ago
0
Unit test lib/dcg.py on the branch locally_test_2
#35
ykdojo
opened
10 years ago
1
Comparing global and skipped means
#34
ykdojo
closed
10 years ago
0
Parser version 2
#33
shlfung
opened
10 years ago
1
Do a t-test with the 9th rank and the 10th rank for the skipped cases
#32
ykdojo
closed
10 years ago
0
Reorganized the file structure. We still need to fix paths in most scripts. Someone, please pull this.
#31
ykdojo
closed
10 years ago
0
Change the seed to random (don't set a seed)
#30
ykdojo
closed
10 years ago
0
Look into LambdaMART and LambdaRank
#29
ykdojo
opened
10 years ago
0
S1 Task A1: Find the relevance mean for documents in each rank position, in the case the user has liked the same query-document pair in previous sessions.
#28
ykdojo
opened
10 years ago
0
Organize the file structure
#27
ykdojo
closed
10 years ago
0
Added script that gets the relevance_means of skipped documents. Added script for global url distributions.
#26
punit-haria
closed
10 years ago
1
Index files
#25
michaeldubyu
closed
10 years ago
0
Benchmark performance for lookups using index files
#24
lchsiao
opened
10 years ago
6
Index files
#23
lchsiao
closed
10 years ago
2
Create index files for train data for faster access
#22
lchsiao
closed
10 years ago
0
Found global relevance mean for each rank position. Plotted them as well.
#21
ykdojo
closed
10 years ago
0
Script to create 1% sample users
#20
lchsiao
closed
10 years ago
1
This is a test pull request, don't pull
#19
lchsiao
closed
10 years ago
1
What happens when a user clicks "next"?
#18
ykdojo
opened
10 years ago
0
Make a team member list with descriptions about their skills, availability, and so on
#17
ykdojo
closed
10 years ago
1
Clean the current scripts we have
#16
ykdojo
closed
10 years ago
0
Task B: Randomly sample users (for instance, 1% of all users) from the original train file and produce a smaller train file. This will be used for quick experiments.
#15
ykdojo
closed
10 years ago
0
Task A.3: Compare A.2 and A.1. Is there a significant difference?
#14
ykdojo
closed
10 years ago
0
Task A.2: Find the relevance mean for documents in each rank position, in the case in which the user has SKIPPED the same document pair in the same session
#13
ykdojo
closed
10 years ago
0
Task A.1: Using the parser, find the global relevance mean for documents in each search rank (position)
#12
ykdojo
closed
10 years ago
0
Make a template for comments
#11
ykdojo
closed
10 years ago
0
Task A.0: Write unit tests for the object-oriented parser
#10
ykdojo
opened
10 years ago
0
Cleaned parser.py and created user_parser.py. NOTE: the parser still does not read the last line
#9
ykdojo
closed
10 years ago
0
parser.py does not parse the last line
#8
ykdojo
closed
10 years ago
0
Benchmarked parser.py. It took about 50 minutes to parse the whole train file (on my SSD 8GB MacBook Pro). There are about 34.5 million sessions in total.
#7
ykdojo
closed
10 years ago
2
Added a Python parser that was written by a Kaggler.
#6
ykdojo
closed
10 years ago
0