flash121 / ReviewAnalysis

Extract Reviews & Do Analysis

Crawl System for Review Extraction #1

Open flash121 opened 10 years ago

flash121 commented 10 years ago

Completed: located the review table in each page

flash121 commented 10 years ago

Completed: pre-processing. Extracted userid, title, content, and star rating for future use

Completed: read the total number of pages

Remaining tasks

  1. walk through all the pages and read all the reviews into memory
  2. dump the results to JSON
  3. read the results back
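
The three remaining steps could be sketched roughly as follows. This is only a minimal sketch, not the code in this repository: `fetch_page` is a hypothetical callable standing in for whatever per-page extraction already exists, and the field names simply mirror the ones listed above (userid, title, content, star).

```python
import json
from typing import Callable, Dict, List

Review = Dict[str, str]  # assumed keys: userid, title, content, star


def collect_reviews(fetch_page: Callable[[int], List[Review]],
                    total_pages: int) -> List[Review]:
    """Walk pages 1..total_pages and keep every review in memory."""
    reviews: List[Review] = []
    for page in range(1, total_pages + 1):
        # fetch_page is hypothetical: it returns the reviews found on one page.
        reviews.extend(fetch_page(page))
    return reviews


def dump_reviews(reviews: List[Review], path: str) -> None:
    """Write the collected reviews to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(reviews, fh, ensure_ascii=False, indent=2)


def load_reviews(path: str) -> List[Review]:
    """Read the reviews back from the JSON file."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

Keeping the page fetcher as a parameter means the walk/dump/read part can be tried with a fake fetcher before it is pointed at the real site.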

flash121 commented 10 years ago

Walk through, pick one, dump, and read are finished.

Next step: generate an MmCorpus file using gensim
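
One possible shape for that step, as a sketch only: it assumes the reviews are a list of dicts with a `content` field (as in the pre-processing comment), and the regex tokenizer and file paths are placeholders chosen for illustration. The gensim calls used are `corpora.Dictionary`, `Dictionary.doc2bow`, and `corpora.MmCorpus.serialize`.

```python
import re
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Very simple placeholder tokenizer: lowercase words and digits."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_mm_corpus(reviews: List[Dict[str, str]], mm_path: str):
    """Serialize review contents to a Matrix Market corpus file."""
    # Imported here so the tokenizer can be used without gensim installed.
    from gensim import corpora

    texts = [tokenize(r["content"]) for r in reviews]
    dictionary = corpora.Dictionary(texts)             # token -> integer id
    corpus = [dictionary.doc2bow(t) for t in texts]    # bag-of-words vectors
    corpora.MmCorpus.serialize(mm_path, corpus)        # writes the .mm file
    dictionary.save(mm_path + ".dict")                 # keep the id mapping
    return corpora.MmCorpus(mm_path)                   # streamed from disk
```

Saving the dictionary next to the `.mm` file keeps the token ids recoverable when the corpus is loaded later for modelling.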