Elaine77777 / lab-07

Lab 07: Dataset alchemy
Creative Commons Attribution 4.0 International

Lab 07 feedback #1

Open Elaine77777 opened 7 months ago

Elaine77777 commented 7 months ago

1) I learned how to convert .rda files into .csv files. That was not my intention in the first place, but it was a happy accident, and I learned a lot from it. I also learned how to process data and use the 'bing' lexicon to prepare the data for sentiment analysis.

2) I actually found getting started the most challenging, and by that I mean trying to figure out how to install the packages in the first place, but after figuring that out, the rest became relatively easy. Learning about how the key works in class was super helpful.

3) Mostly the instructor, but also ChatGPT for debugging, which was extremely efficient.

4) In the future, suppose I want to conduct a sentiment analysis of a movie script: do I keep the non-content words? In this lab, we decided to keep them so the processed dataset stays as close as possible to the original dataset, but do we always do that? The resource I will certainly keep using in the future is ChatGPT: https://chat.openai.com/

francojc-lin380s24 commented 7 months ago

Great self-assessment. Let me comment on #4. In sentiment analysis, many of the non-content words will not carry much meaning. They tend to be function words that, by themselves, do not add to the overall sentiment. That is why they are usually removed or ignored. If you are using an algorithm (say VADER or sentimentr) that does sentiment analysis at the sentence, paragraph, or document level, you may need to keep these function words, as they can and do contribute to meaning, just not at the word level.
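
To make the word-level point concrete, here is a minimal sketch in Python. The lexicon and function-word list are tiny, made-up stand-ins (not the bing lexicon or any package's stopword list), and the helper names are hypothetical. It shows that dropping function words does not change a word-level score, because those words are not in the lexicon to begin with.

```python
# Toy word-level sentiment scoring.
# ASSUMPTION: TOY_LEXICON and FUNCTION_WORDS are invented for illustration only.
TOY_LEXICON = {"good": 1, "great": 1, "happy": 1, "bad": -1, "terrible": -1, "sad": -1}
FUNCTION_WORDS = {"the", "a", "an", "is", "was", "not", "but", "and", "of", "to", "it"}


def tokenize(text):
    """Split on whitespace, strip surrounding punctuation, and lowercase."""
    tokens = (tok.strip(".,!?;:\"'").lower() for tok in text.split())
    return [tok for tok in tokens if tok]


def remove_function_words(tokens):
    """Drop tokens that appear in the toy function-word list."""
    return [tok for tok in tokens if tok not in FUNCTION_WORDS]


def word_level_score(tokens):
    """Sum lexicon values; words missing from the lexicon contribute 0."""
    return sum(TOY_LEXICON.get(tok, 0) for tok in tokens)


sentence = "The movie was not good, but the ending was great."
tokens = tokenize(sentence)
content_tokens = remove_function_words(tokens)

score_all = word_level_score(tokens)              # function words kept
score_content = word_level_score(content_tokens)  # function words removed

print(tokens)
print(content_tokens)
print(score_all, score_content)
```

Both scores come out the same, so at the word level the function words are safe to drop. Notice, though, that "not" disappears along with the others, and the word-level score still counts "good" as positive. That negation is the kind of information a sentence-level approach can use, which is why you may want to keep function words there.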