3878anonymous / CAML


How to preprocess text? #2

Open tienyeung opened 4 years ago

tienyeung commented 4 years ago

I can't find the .py file for text preprocessing. Could you push it?

3878anonymous commented 4 years ago

I can't find the .py file for text preprocessing. Could you push it? Have you found a way to preprocess the text? I also don't know how to extract the crucial concepts from reviews.

Sorry for missing this question. Due to some copyright issues, I cannot release the preprocessing code, and it is not implemented in Python anyway. The concept extraction process is:

  1. Tokenize the reviews (with word segmentation) and run POS tagging.
  2. Filter the words by POS tag and keep only nouns.
  3. Optionally, query the Microsoft Concept Graph API (/api/Concept/ScoreByProb) for each remaining word and remove the words for which it returns NULL.
  4. Filter the remaining words by popularity to get the final concept dict.
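
For readers who want a starting point, the four steps above could be sketched in Python roughly as follows. This is not the original code (which is unreleased and not written in Python), only a minimal illustration under several assumptions: the names `build_concept_dict`, `tagger`, `has_concept`, and `min_count` are hypothetical, noun tags are assumed to follow the Penn Treebank convention of starting with "NN", "popularity" is assumed to mean the number of reviews a word appears in, and the "concept dict" is assumed to map each word to an integer id. The POS tagger and the Concept Graph lookup are passed in as functions so that any tagger or HTTP client can be plugged in.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical interfaces (not from the original code):
# a tagger turns one review into (token, POS tag) pairs, and
# a concept check says whether a word is known to a concept knowledge base.
Tagger = Callable[[str], List[Tuple[str, str]]]
ConceptCheck = Callable[[str], bool]


def build_concept_dict(
    reviews: Iterable[str],
    tagger: Tagger,
    has_concept: Optional[ConceptCheck] = None,
    min_count: int = 2,
) -> Dict[str, int]:
    """Return a mapping from concept word to integer id."""
    review_counts: Counter = Counter()
    for review in reviews:
        # Steps 1-2: tokenize and tag, then keep nouns only.
        # Assumes Penn Treebank style tags, where noun tags start with "NN".
        nouns = {tok.lower() for tok, tag in tagger(review) if tag.startswith("NN")}
        # Count each noun at most once per review.
        review_counts.update(nouns)

    candidates = list(review_counts)

    # Step 3 (optional): drop words the concept lookup does not recognize.
    if has_concept is not None:
        candidates = [w for w in candidates if has_concept(w)]

    # Step 4: keep words that appear in at least `min_count` reviews.
    popular = [w for w in candidates if review_counts[w] >= min_count]

    # Most frequent first; ties broken alphabetically so ids are deterministic.
    popular.sort(key=lambda w: (-review_counts[w], w))
    return {word: idx for idx, word in enumerate(popular)}
```

In practice `tagger` could wrap an off-the-shelf POS tagger for the review language, and `has_concept` could wrap a request to the ScoreByProb endpoint that returns False when the response is empty. Both are left out here because their exact setup is not given in this thread.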

gitshuqing commented 3 years ago

Could you email me the code? I can't manage to implement it on my own, but I really want to learn. Thank you very much!