ykdojo / personalized_search_challenge

An attempt at the Kaggle competition Personalized Web Search Challenge, hosted by Yandex (http://www.kaggle.com/c/yandex-personalized-web-search-challenge)
S1 Task A0: Create a parser that parses users instead of sessions. #49

Closed: ykdojo closed this issue 10 years ago

michaeldubyu commented 10 years ago

The file user_parser_class.py now lives in libs/ on the index_files branch. It reads a training file sequentially, collects the unique user IDs, and uses the session lookup hash to find the lines that belong to each session. It then passes this information to session_parser.py, where the static method from_rows creates a Session object for every session associated with a user. See test_user_parser.py for details.
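For readers without the branch checked out, the flow described above could be sketched roughly as follows. This is not the code in user_parser_class.py: the `Session` stand-in, the `parse_lines` helper, and the in-memory lookup are hypothetical, and the sketch assumes tab-separated log lines in which a session metadata row looks like `SessionID, "M", Day, UserID`.

```python
from collections import defaultdict


class Session:
    """Hypothetical stand-in for the Session class in session_parser.py."""

    def __init__(self, session_id, rows):
        self.session_id = session_id
        self.rows = rows

    @staticmethod
    def from_rows(rows):
        # Assumption: every row starts with its session id.
        return Session(rows[0][0], rows)


class User:
    def __init__(self, user_id, sessions):
        self.user_id = user_id
        self.sessions = sessions

    @staticmethod
    def parse_lines(lines):
        # Hypothetical helper: group rows by session, then sessions by user.
        session_rows = defaultdict(list)  # session id -> rows (the lookup)
        user_sessions = {}                # user id -> session ids, first-seen order
        for line in lines:
            fields = line.rstrip("\n").split("\t")
            if fields == [""]:
                continue  # skip blank lines
            session_rows[fields[0]].append(fields)
            # Assumed metadata layout: SessionID, "M", Day, UserID.
            if len(fields) >= 4 and fields[1] == "M":
                user_sessions.setdefault(fields[3], []).append(fields[0])
        return [
            User(uid, [Session.from_rows(session_rows[sid]) for sid in sids])
            for uid, sids in user_sessions.items()
        ]

    @staticmethod
    def parse_from_file(filepath):
        # Reads the file sequentially, one line at a time.
        with open(filepath) as handle:
            return User.parse_lines(handle)
```

The sketch keeps all rows in memory, which is only reasonable for a sample file; the lookup hash mentioned above presumably exists so that the full training file does not have to be held in memory.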

michaeldubyu commented 10 years ago

Ok, this should be closeable. See #52.

ykdojo commented 10 years ago

How did you test this?

On Sun, Dec 22, 2013 at 3:30 AM, Michael Wu notifications@github.com wrote:

> Ok, this should be closeable. See #52 (https://github.com/yosukesugishita/personalized_search_challenge/pull/52).

michaeldubyu commented 10 years ago

Calling User.parse_from_file(filepath) produces User objects. I tested this on a smaller sample of the training set by comparing the script's output against the users and sessions in the sample.
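A check along those lines might look like the sketch below. It is illustrative only and is not taken from test_user_parser.py: the attribute names `user_id`, `sessions`, and `session_id` are assumptions, as is the tab-separated metadata layout `SessionID, "M", Day, UserID`. Stand-in objects take the place of real parser output so the example runs on its own.

```python
from types import SimpleNamespace


def expected_sessions_by_user(lines):
    """Independently derive user id -> set of session ids from raw sample lines."""
    expected = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Assumed metadata layout: SessionID, "M", Day, UserID.
        if len(fields) >= 4 and fields[1] == "M":
            expected.setdefault(fields[3], set()).add(fields[0])
    return expected


def matches_sample(users, lines):
    """True when the parsed users carry exactly the sessions found in the sample."""
    actual = {u.user_id: {s.session_id for s in u.sessions} for u in users}
    return actual == expected_sessions_by_user(lines)


# Tiny hand-written sample: user 10 has sessions 1 and 2, user 11 has session 3.
sample = ["1\tM\t5\t10\n", "2\tM\t6\t10\n", "3\tM\t6\t11\n"]

# Stand-ins for what the parser would return for that sample.
parsed = [
    SimpleNamespace(
        user_id="10",
        sessions=[SimpleNamespace(session_id="1"), SimpleNamespace(session_id="2")],
    ),
    SimpleNamespace(user_id="11", sessions=[SimpleNamespace(session_id="3")]),
]

print(matches_sample(parsed, sample))
```

In a real test, `parsed` would come from User.parse_from_file on the sample file rather than from hand-built objects.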