acstat231-f23 / Blog-Internet_Explorers

Alex Nichols, Maximo Gonzalez, Ethan Van De Water
https://acstat231-f23.github.io/Blog-Internet_Explorers/

Blog Plan #1

Open evandewater26 opened 10 months ago

evandewater26 commented 10 months ago
  1. Our blog project will not be an extension of the mid-semester project. Our general topic is how sentiment among executives at financial service institutions relates to global macroeconomic trends. For example, we will look at how the frequency with which ‘war’ is mentioned affects certain financial markets and indicators, e.g. futures contracts for crude oil, the flow of international trade, and the strength of different currencies. We will also explore how these sentiments influence individual institutions. Data on nations’ imports/exports is widely available, as are trading volumes of publicly listed securities such as oil futures contracts. Data on the use of ‘war’ and other terms will be harder to acquire. We will likely use a mixture of web scraping and pre-constructed datasets to build our blog.

  2. We foresee our blog including a shiny application and a series of network maps to give context to the story our data conveys. We will also explore the option of a predictive model based on historical data (prior conflicts, trends, etc.) to build our case further.

  3. I think our first goal, similar to our Covid-Economic project, is to gather and consolidate the data we find. If no issues arise in data collection (e.g. an inability to work around web scraping restrictions), we will then structure an argument. By Status Update 2, we expect to have our argument and story confirmed and robust. This is an approach our last project taught us: a strong foundation allows us to organize and produce a tighter final project. We’ll then build out primitive versions of the visualizations, models, and applications we want to include in the final blog post. This will be the most coding-intensive part of the project, and our resulting story will be shaped here. Last come the final touch-ups, around the week of 12/7. Presentations will again inform our final decisions.

katcorr commented 10 months ago

Interesting and creative ideas here! You mention a shiny app, but it's not clear what interactivity is needed based on the rest of your plan. Remember that a shiny app / interactivity is not necessary, so you could include static visualizations (or ones with more limited interaction via plotly rather than shiny) in your blog post.

Regarding the data acquisition -- in class we talked about needing to copy and paste text for some statements into txt files because web scraping isn't allowed. For those cases, for reproducibility purposes, please document the date(s) and website(s) from which you acquired the text.

Blog Plan: 10/10

evandewater26 commented 10 months ago

Update #1 We’ve started to gather and consolidate speeches and articles for text analysis. We haven’t encountered many instances where web scraping is not allowed, but the format of the websites where the text is found often makes web scraping difficult. For this reason, we have often elected to copy and paste the text into a .txt file, with a reference to the website we pulled the text from. We expected to have more data collected, but group members have had a series of midterm exams and projects that have taken up a good chunk of our time recently. When we return from break, we will work to finish collecting our text data, sourcing our more strictly quantitative data, and wrangling our datasets.

katcorr commented 10 months ago

OK! How will you keep each other accountable for progress on the project in the weeks after break? Do you think you can catch up upon return such that you're still on track for your status update 2 deadline to have your argument and story set?

Status Update 1: 5/5

evandewater26 commented 9 months ago

Update #2 We've made good progress collecting and wrangling text data. We collected 10 relevant articles from a series of sources that relate to the global macro environment as well as current global conflicts. From these texts, we performed a text analysis similar to the one we did for Emily Dickinson's poems. The wrangling required a deeper dive into some R libraries we didn't cover much in class (stringr, etc.), and once this was completed we had a word frequency table. As a preliminary step, we represented these frequencies in a visualization that shows the top 15 most frequently used words across the chosen articles. We plan to make this interactive through shiny/plotly so that the user can choose the number of words they want to see. Overall, I think this piece does a good job of using text analysis to answer what the sentiment among executives and world leaders is.
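
As a rough illustration of the word-frequency step described above, here is a minimal sketch in Python (not the group's R/stringr workflow, which is not shown in this thread). The sample texts, the stop-word list, and the `top_words` helper are all hypothetical stand-ins; the parameter `n` plays the role of the user-chosen number of words.

```python
import re
from collections import Counter

# Hypothetical stand-ins for the collected articles; the real texts
# are not included in this thread.
ARTICLES = [
    "War raises oil prices, and oil markets react to war.",
    "Trade slows as war risk grows; oil supply tightens.",
]

# A small illustrative stop-word list (an assumption, not the group's list).
STOP_WORDS = {"a", "and", "as", "the", "to", "of", "in"}


def top_words(texts, n=15, stop_words=STOP_WORDS):
    """Return the n most frequent non-stop words across all texts."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and split it into runs of letters/apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in stop_words)
    return counts.most_common(n)


if __name__ == "__main__":
    # n is the number of words to show, e.g. a value chosen by the user.
    for word, count in top_words(ARTICLES, n=3):
        print(word, count)
```

Keeping `n` as a function argument means the same table can back a static top-15 chart or an interactive control without recomputing the tokenization logic.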

Given this, we are more or less on track to have completed our data story as we forecasted in the initial proposal. Next, we need to work out the implementation of international trade data and the syntactical difficulties of wrangling and creating a network based on data from different sources. It will be a challenging but rewarding exercise.

Overall, we will continue to keep each other accountable by delegating tasks while ensuring that each group member still has a good understanding of the code and concepts that are being used throughout the project.

katcorr commented 9 months ago

Sounds good!

Status Update 2: 5/5