Closed by fanyike 6 years ago
I have the same question. May I ask which file is the main part of the algorithm? Should I run all the Python files? Sorry to raise such basic questions, but I have never worked with Python before...
@fanyike @SeverusBing Thanks for your interest in this work. The readme file has been updated. Check it to obtain Yelp-50K and Amazon-50K.
Feel free to post here if you have any questions.
Thank you so much for your reply! I just tried to open the website you mentioned in the readme file, but it seems to be unavailable...
@SeverusBing What do you mean by "unavailable"? It's a Dropbox link from which you can download the dataset. In fact, I asked several people to test it, and the link works.
Sorry! Maybe there was something wrong with my internet connection yesterday. I have the data package now, thank you!
I'm sorry to disturb you again... I put the data folder in the project directory and ran the example command "python run_exp.py config/yelp-50k.yaml -reg 0.5", but I got a FileNotFoundError: No such file or directory: 'D:\FMG-master\log\fmg_yelp-50k_vary_reg_split1.log'. I checked the code of run_exp.py and found that the set_logfile function may be related to this error. It includes these statements:
logfilename = 'log/fmg%s_%s_split%s.log' % (config['dt'], config['exp_type'], config['sn'])
if config['exp_type'] == 'vary_mg':
    logfilename = 'log/fmg%s_%s_split%s_reg%s.log' % (config['dt'], config['exp_type'], config['sn'], config['reg'])
May I ask what this function is for? And what is 'fmg_yelp-50k_vary_reg_split1.log'? Is it a data file? Should I rename the files in the data folder to the form "fmg_yelp-50k_vary_reg_split1"? Besides, I didn't find any .log file in either the data folder or the code folder...
@SeverusBing How about making a directory named "log" in the project directory?
A "log" directory with nothing in it? OK I will have a try, so the "fmg_yelp-50k_vary_reg_split1.log" is a result document?
@SeverusBing It's a log file that records what the program outputs while running. You can run it and have a look :-)
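For anyone hitting the same FileNotFoundError: the log file is created by the program, but its parent directory has to exist first. Besides making the "log" directory by hand, a minimal sketch of creating it from Python is shown below. The helper name ensure_parent_dir and the temporary demo directory are hypothetical and are not part of run_exp.py.

```python
import os
import tempfile


def ensure_parent_dir(path):
    """Create the parent directory of `path` if it does not exist yet.

    Hypothetical helper for illustration; it is not part of run_exp.py.
    """
    parent = os.path.dirname(path)
    if parent:
        # exist_ok=True makes the call a no-op when the directory is already there.
        os.makedirs(parent, exist_ok=True)
    return parent


# Demo in a temporary directory so nothing in the real project is touched.
demo_root = tempfile.mkdtemp()
demo_log = os.path.join(demo_root, "log", "fmg_yelp-50k_vary_reg_split1.log")

ensure_parent_dir(demo_log)
with open(demo_log, "a") as f:
    f.write("run started\n")
```

In the project itself, the equivalent step would be calling os.makedirs('log', exist_ok=True) once from the project directory before running the experiment.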
I see! Thank you so much!
@PhoenixZhao Thanks! If my ratings are continuous numbers, I mean, if the label is continuous, can I still use your code?
@fanyike I think so. You can try it then.
@PhoenixZhao Do you mean that this code can be used with continuous ratings?
@fanyike Yes, this code runs whether the ratings are discrete or continuous.
Thanks for your reply! Excellent work.
@PhoenixZhao Is there any introduction on how to use custom data? The Yelp data format is very complicated.
@fanyike Actually, my code does not require the Yelp format. I preprocessed the data for my code, so you can look at the details of the data I released.
Sorry to bother you again. Could you tell me the meaning of the data in the mf_features/path_count folder? I noticed that there are 11 columns in each data file, but I can't work out what these columns represent. Judging from a file name such as "UNBUB_top500_item", I guess that "UNBUB" is a meta-path or meta-graph with five nodes, so I'm confused about why there are 11 columns in the data file. Could you kindly give me a hint? Thank you very much!
@SeverusBing In general, these files record the latent features obtained from the corresponding meta-graphs (read the paper for the details). In our experiments, we set the rank of MF to 10, so we obtain a 10-dimensional vector for each user and item. The first column is the id of the user or item, and the remaining 10 columns are the latent features.
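For readers who want to load such a file, the layout described above (one id column followed by rank-many latent features) can be sketched as follows. This is only an illustration: the function name load_latent_features is hypothetical, and the whitespace delimiter and integer ids are assumptions, so check the released files before relying on it.

```python
import io

import numpy as np


def load_latent_features(source, rank=10, delimiter=None):
    """Split a latent-feature table into ids and feature vectors.

    Assumes each row is: id, then `rank` latent features.
    delimiter=None means "any whitespace" (an assumption about the files).
    """
    data = np.loadtxt(source, delimiter=delimiter, ndmin=2)
    if data.shape[1] != rank + 1:
        raise ValueError(
            "expected %d columns, found %d" % (rank + 1, data.shape[1])
        )
    ids = data[:, 0].astype(int)   # first column: user or item id
    features = data[:, 1:]         # remaining columns: latent features
    return ids, features


# Tiny in-memory stand-in for a file such as "UNBUB_top500_item" (made-up values).
sample = io.StringIO(
    "3 " + " ".join(["0.1"] * 10) + "\n"
    "7 " + " ".join(["0.2"] * 10) + "\n"
)
ids, feats = load_latent_features(sample)
print(ids, feats.shape)
```

With a real file, pass its path instead of the StringIO object, and set delimiter (for example ",") if the columns turn out not to be whitespace-separated.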
@PhoenixZhao Oh, I see! Then, are these matrices we get in the fm_res folder the results of the code? Would you mind telling me what the P or W in the file name means? Such as "yelp-50k_split1_P" or "yelp-50k_split1_W", thanks!
@SeverusBing W and P represent the variables W and V in the factorization machine. Here I used P to denote V because of some coding problems. Sorry for the inconvenience.
@PhoenixZhao Got it! Thanks soooooo much! But I wonder why there is only one column in each W matrix. As I understand it, W holds the first-order weights for the features, and I noticed that both W and P have 240 rows, so I'm confused about how to recover the rating matrix to get the recommendation results...
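For readers with the same question, here is a minimal sketch, assuming the standard second-order factorization machine: a predicted rating is bias + sum_i W[i]*x[i] + sum_{i<j} <P[i], P[j]>*x[i]*x[j], where x is the feature vector of one user-item pair. Under that assumption W has one weight per feature (hence a single column) and P has one row per feature, which is consistent with both having 240 rows. How x is assembled (presumably by concatenating the user and item latent features of the chosen meta-graphs) and whether the released code uses a bias term are not confirmed in this thread. The function name fm_predict and the toy numbers are hypothetical.

```python
import numpy as np


def fm_predict(x, W, P, bias=0.0):
    """Score one feature vector with a standard second-order FM.

    x: (n,) features of one user-item pair (assumed layout).
    W: (n,) or (n, 1) first-order weights.
    P: (n, k) second-order factors (called V in the FM literature).
    """
    x = np.asarray(x, dtype=float).ravel()
    w = np.asarray(W, dtype=float).ravel()
    V = np.asarray(P, dtype=float)

    linear = w @ x
    # Pairwise term via the usual identity:
    # sum_{i<j} <v_i, v_j> x_i x_j
    #   = 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2 ]
    xv = V.T @ x
    x2v2 = (V ** 2).T @ (x ** 2)
    pairwise = 0.5 * np.sum(xv ** 2 - x2v2)
    return bias + linear + pairwise


# Toy example with 2 features and 2 factors (made-up numbers).
x_toy = [1.0, 2.0]
W_toy = [[0.5], [0.25]]
P_toy = [[1.0, 0.0], [2.0, 1.0]]
score = fm_predict(x_toy, W_toy, P_toy)
print(score)  # linear part 1.0 + pairwise part 4.0 = 5.0
```

To fill in a rating matrix, one would call such a function once per user-item pair with that pair's feature vector. The matrix is not stored directly in W or P.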
Please tell me how to run the code on the data. Should I download the data myself?