morningmoni / HiLAP

Code for paper "Hierarchical Text Classification with Reinforced Label Assignment" EMNLP 2019

cannot run Yelp dataset #4

Closed lanliliana closed 4 years ago

lanliliana commented 4 years ago

I'm trying to run the model with the Yelp dataset. I've downloaded the dataset from https://www.yelp.com/dataset/challenge (file name: "yelp_dataset"). However, I cannot run the program "readData_yelp.py", because the code opens two files (line 9: 'yelp/Taxonomy_100'; line 34: 'yelp/yelp_data_100.csv') and I don't understand where they come from. Am I missing a step to get these files? I cannot find any code that generates them from the original Yelp dataset.

Thank you for all your assistance.

morningmoni commented 4 years ago

Hi, I added the Yelp taxonomy (yelp/Taxonomy_100) I used in the experiment. The file yelp/yelp_data_100.csv consists of Yelp businesses and their categories; since it is too big to upload, I provided a two-line sample file in the same folder. They were generated from the original Yelp dataset, and you can generate them yourself by following the same format. Note that the Yelp dataset differs (slightly?) from year to year, though. Let me know if you have other questions.
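
For readers who want to regenerate such a file, here is a minimal sketch of the extraction step, not the script actually used for the paper. It assumes the Yelp business file is JSON lines with `business_id` and `categories` fields, and that `categories` is either a comma-separated string or a list, depending on the dataset version. The output layout (business ID, then pipe-joined categories) and the helper names `parse_categories`, `business_rows`, and `write_rows` are hypothetical; check the two-line sample file in the `yelp` folder and adapt the columns to match it.

```python
import csv
import io
import json


def parse_categories(raw):
    """Normalize a `categories` value to a list of category names.

    Assumption: the field is a comma-separated string, a list, or null.
    """
    if raw is None:
        return []
    if isinstance(raw, str):
        return [c.strip() for c in raw.split(",") if c.strip()]
    return [str(c).strip() for c in raw]


def business_rows(lines):
    """Yield (business_id, [categories]) for each JSON line that has categories."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        categories = parse_categories(record.get("categories"))
        if categories:
            yield record["business_id"], categories


def write_rows(rows, out, sep="|"):
    """Write one CSV row per business (hypothetical layout, not the repo's)."""
    writer = csv.writer(out)
    for business_id, categories in rows:
        writer.writerow([business_id, sep.join(categories)])


# Tiny in-memory sample standing in for the real business file.
sample = [
    '{"business_id": "b1", "categories": "Restaurants, Pizza"}',
    '{"business_id": "b2", "categories": ["Bars", "Nightlife"]}',
    '{"business_id": "b3", "categories": null}',
]
buffer = io.StringIO()
write_rows(business_rows(sample), buffer)
print(buffer.getvalue())
```

To run this on the real data, pass an open file handle for the business file to `business_rows` and an output file opened with `newline=""` to `write_rows`.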

liuyijiang1994 commented 4 years ago

@morningmoni Hi moni, as you said, the Yelp dataset is different each year. Could you please provide a copy of the CSV file used in your work? The newest version I can download is from Kaggle, which may be VERY different from the version you used. Could you tell me which version of the Yelp dataset you used, or upload it to Google Drive or Baidu Net Disk, or send it directly to my email (cslyj@whu.edu.cn), if it's convenient? Thanks a lot!