zenogantner / MyMediaLite

recommender system library for the CLR (.NET)
http://mymedialite.net
502 stars 189 forks source link

support to Koren's neighborhood model and integrated model #373

Closed mmanzato closed 11 years ago

zenogantner commented 11 years ago

Hi Marcelo,

thank you very much for the pull request. It is quite large ;-)

I have several remarks:

  1. Can you squash 83d29e9 and 6a77684? See here: http://gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html
  2. In those files where you added your code, please add your name to the (c) notice.
  3. Variable names in GSVDPlusPlus:
    • Can you change nAttributes to n_attributes or even better, num_attributes for consistency with the existing code?
    • genre_id could be called attribute_id (more generic)
  4. The empty lines 129 and 130 could be removed in GSVDPlusPlus.cs
  5. You can remove the comments " // TODO better name than x" and "// TODO vectorize".
  6. Can you replace "if(" by "if (" (same with "foreach"), and make sure that curly braces for if-else start in a new line? This is for consistency with the rest of the code, I do not say that one way is better than the other.
  7. Are you sure that ImplicitFeedbackKNN will scale well? In particular the matrices c and w, and the 2d arrays rkiu and nkiu are worrying me. On which dataset have you tested it?
  8. I am not so comfortable with this line in ImplicitFeedbackKNN: "if (this is IntegratedSVDPlusPlusKNN)": It implements child class behavior in the parent class, which can be very confusing to people who read the code.
  9. It is also confusing that ImplicitFeedbackKNN inherits from SVDPlusPlus and has ItemKNN as a member. Wouldn't it be rather that we should have IFKNN inherit from ItemKNN, and then inherit ISVDPPKNN from SVDPlusPlus, and compose ItemKNN into it? Also, is ImplicitFeedbackKNN supposed to be used by itself?

I know, this is a lot of stuff ;-) I think we can make this easier for us if we do separate pull requests for the different recommenders. The best would be 3 distinct pull requests, starting with GSVDPlusPlus. This would make it much easier to discuss the changes.

What do you think?