acstat231-f23 / blog-network-mammoths

Daizy Buluma, Christian Manzi, Clara Hoey
https://acstat231-f23.github.io/blog-network-mammoths/

Blog Plan #1

Open choey24 opened 1 year ago

choey24 commented 1 year ago

(Couldn't include the table by copying and pasting text, so attached blog plan as PDF!)

PUG project 2: Blog Post.pdf

katcorr commented 12 months ago

[three attached images]

katcorr commented 12 months ago

Great ideas regarding this new topic, and nice detailed schedule. I think the clustering analysis and text analysis will be interesting. In your schedule, you mention two Shiny apps, but that seems disconnected from the rest of the plan. Interactivity is not required in the blog post or for this project, and I didn't see any other mention of interactivity in the proposed ideas. Do you really need the Shiny apps, or could you create static visualizations based on your clustering and text analyses and include them on the blog post website?

Blog Plan: 10/10

cmanzi00 commented 11 months ago

STATUS UPDATE #1

In response to the feedback we received, we refined our project schedule, opting for static visualizations over Shiny apps and adjusting deadlines. Currently, we are adhering to our updated schedule, having achieved two milestones: preliminary data wrangling and a reassessment of our questions of interest. Our focus has shifted towards performing text analysis on the wrangled data, aiming to identify the most frequently used programming languages and those associated with higher developer compensation, organized by country. In addition, we would like to understand developers' sentiments towards AI tools from their open-ended responses to the survey. Finally, we plan to explore the correlation between developers' language preferences/tools used and their levels of experience. Our upcoming tasks involve the distribution of responsibilities, initiating text analysis on developers' attitudes towards AI tools based on survey responses, and designing static visualizations to present our findings effectively.

Updated schedule for reference:

[image: updated schedule]

katcorr commented 11 months ago

Ok, sounds good!

Status Update 1: 5/5

choey24 commented 11 months ago

Status update 2:

We are mostly on track with our current schedule. So far, we have implemented some preliminary text analysis. To get there, we refined the (already reduced) dataset to include only our questions of interest. Although we selected several questions relating to developers' feelings about AI, for the text analysis we isolated one variable: the developers' responses to an open-ended question asking for their opinion on Stack Overflow using AI tools and which AI tools they think would be most helpful. We have run text analysis on these responses similar to what we did with the Emily Dickinson poems, finding the key words and eliminating filler words. From there, we made a list of several R packages that we can use for sentiment analysis. So far, we have only used the one we already knew about, from the textdata package, but there are a few others that we would like to try as well. We were a little unclear about what types of analysis other than text/sentiment analysis would look like, so we have not yet expanded into other types of data analysis. However, we would like to try clustering and plan to work on it next week, with the goal of having both the text analysis and the clustering ready by next Tuesday, per the schedule in our first status update above.
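
For readers unfamiliar with the workflow described above (find key words, drop filler words, then score sentiment against a lexicon), here is a minimal sketch. It is written in Python purely for illustration; the team's actual work is in R, and this is not their code. The stop-word list, the lexicon and its scores, the function names, and the sample responses are all hypothetical stand-ins, not survey data or a published lexicon.

```python
import re
from collections import Counter

# Hypothetical stand-ins: a real analysis would use a full stop-word list
# and a published sentiment lexicon rather than these tiny examples.
STOP_WORDS = {"a", "an", "and", "are", "be", "but", "for", "i", "is",
              "it", "of", "the", "to", "would", "with", "not", "they"}
LEXICON = {"helpful": 2, "useful": 2, "great": 3,
           "wrong": -2, "unreliable": -2, "worried": -2}


def tokenize(text):
    """Lowercase a response and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def key_words(responses, top_n=5):
    """Count non-filler words across all responses, most frequent first."""
    counts = Counter(
        tok
        for response in responses
        for tok in tokenize(response)
        if tok not in STOP_WORDS
    )
    return counts.most_common(top_n)


def sentiment_score(text):
    """Sum lexicon scores of the words in one response (unknown words score 0)."""
    return sum(LEXICON.get(tok, 0) for tok in tokenize(text))


# Made-up example responses, not taken from the survey.
responses = [
    "AI tools would be helpful for debugging",
    "I am worried the answers are wrong",
    "Code search with AI is great and useful",
]

top_words = key_words(responses, top_n=1)
scores = [sentiment_score(r) for r in responses]
```

Summing per-word scores is the simplest lexicon-based approach. It ignores negation and context ("not helpful" still scores as positive), which is one reason to compare several lexicons or packages, as the update proposes.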

katcorr commented 11 months ago

Sounds good!

Status Update 2: 5/5