ernbilen / Data400_Fall24

Course page for Data 400, Fall 24 at Dickinson College.

Idea 2 & 3 #18

Open seleneng opened 4 weeks ago

ernbilen commented 1 week ago

Great topic: text analysis is really hot these days because of ChatGPT. If you can find any results and tie them to fake product reviews, that could be really interesting. Check this out in case it's helpful: https://pubsonline.informs.org/doi/abs/10.1287/mksc.2022.1353 If not, I think trying to predict the score from sentiment may not be as interesting, given that in almost any ML application you would have access to both the text and the scores. It usually only makes sense if you had access to, say, just the text and wanted to predict the scores. So maybe you could try to find interesting patterns, say through word clouds and such (of course, after removing stopwords). You can also look into text summarization, which Amazon has recently been using: you feed the model 10 reviews and it outputs a 2-3 sentence summary of what people think about the product. But again, this may not be too interesting given that Amazon already has something like that. You could perhaps try a product or manufacturer analysis, looking at what types of products are liked better, whether there are certain brands that people like a lot, etc.
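To make the word cloud suggestion a bit more concrete, here is a minimal sketch of the counting step behind one: tokenize the reviews, drop stopwords, and tally what is left. Everything in it is an assumption for illustration: the `STOPWORDS` set is a tiny hand-written list (a real project would use a fuller one), `top_words` is a hypothetical helper name, and the sample reviews are made up rather than taken from any dataset.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stopword list (assumption, not a standard list).
STOPWORDS = {
    "the", "a", "an", "and", "is", "it", "this", "to", "of",
    "i", "but", "was", "for", "in", "very",
}


def top_words(reviews, n=5):
    """Return the n most frequent non-stopword tokens across all reviews."""
    counts = Counter()
    for text in reviews:
        # Lowercase, then keep runs of letters and apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(n)


# Made-up sample reviews, for illustration only.
sample_reviews = [
    "The battery is great and the screen is great",
    "Battery died in a week, but the screen was fine",
    "Great battery for the price",
]

print(top_words(sample_reviews, n=3))
```

The resulting (word, count) pairs are the kind of input a word cloud tool would take. Running the same count separately per brand or per star rating would be one way to look for the patterns mentioned above.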