acstat231-f23 / blog-yay-soccer

Gretta Ineza, Sierra Rosado, Lindsay Ward
https://acstat231-f23.github.io/blog-yay-soccer/

Blog Plan #1

Open srosado21 opened 8 months ago

srosado21 commented 8 months ago
  1. We plan to use a different dataset for this project. We will be using a soccer dataset from Kaggle: https://www.kaggle.com/datasets/kriegsmaschine/soccer-players-values-and-their-statistics. Using this dataset, we plan to explore spatial data and network connections. Spatial data: a world map colored by how many players in a chosen league come from each country. Which countries supply the most players to a given league? (We might have to build this in Shiny and link to it because of the interactivity.) Network connections: goals scored between teams. For example, arrows would point between teams and be colored/sized by the number of goals they represent. Which clubs have scored the most goals against each other?
  2. Our map is interactive because the user can pick which league they want to see. However, we might limit the choices to 3 leagues, in which case we could simply include the three maps as images. What do you think? Should we link a Shiny application, or would making 3 maps and including them as images make more sense? For the network connections, I believe we will produce 3 different connection graphs and include them as images.
  3. Schedule: By status update 1: have wrangled data. By status update 2: have initial code done (maps and network connection graphs). By class feedback: add rough draft of what we want to write. By final blog post: using feedback, update code for maps and network connections. Also, finalize writing.

katcorr commented 8 months ago

I like your idea to switch to looking at maps and networks within the professional soccer world. I see that the linked Kaggle dataset has the information needed for the map ideas. But I don't see information on game results (scores from games) that you would need for the network. Did you also find a data source for that part of the plan?

I think if there are only 3 leagues, then it makes sense to include 3 static maps in the blog post rather than making it interactive through Shiny. This would also be consistent with your three static network visualizations.

For your schedule, the timeline sounds reasonable. Do you also have a plan for how often you'll meet (and will the meetings have set agendas, or will they be used to work on some part together)?

Blog Plan: 10/10

srosado21 commented 7 months ago

Status update 1: We are a bit behind schedule since the original dataset we found requires a lot of wrangling and also does not have some data needed for the network connection. We have been trying to find another dataset that works for the network connection. Therefore, we will update our timeline. Instead of having the wrangled dataset done by today, we want to have that done by status update 2. Moreover, we still want to have the initial code done by status update 2. Initial code means having all the rough code needed for the maps and network connections. It is okay if we have errors and need to fix them in order for these maps and plots to show up. We plan to meet within a few days after status update 2 to work on fixing errors together and making sure the code works in the shared repo. We will then work separately on the writing portion of the blog project and then come back together to go over it one more time before class presentations. After that, we will meet once or twice more to adjust the blog based on peer feedback.

katcorr commented 7 months ago

@srosado21 @lindsaywardd @grettain

Ok! You mention

"Instead of having the wrangled dataset done by today, we want to have that done by status update 2. Moreover, we still want to have the initial code done by status update 2. Initial code means having all the rough code needed for the maps and network connections. It is okay if we have errors and need to fix them in order for these maps and plots to show up."

That's two big-ish checkpoints now to be done by status update 2; in other words, it might be hard to finalize the data wrangling on 11/30 AND then also do the initial code for the maps and network connections on that date. Can you break it down further -- by what (earlier) date will the data wrangling be done? And then how much time will that give you between wrangling and the initial code for maps and networks?

srosado21 commented 7 months ago

We want to have the wrangling done by the 28th and then have the initial code done by status update 2 (11/30). That will give us two or three days to write the initial code, which should hopefully be more than enough time since we already have some base code to follow from the labs and homework.

katcorr commented 7 months ago

Ok, thanks for the update!

Status Update 1: 5/5

srosado21 commented 7 months ago

Status Update 2

We are up to date with our schedule. We have the wrangled data and the initial code for the maps and network connections (they work but need some tweaking to make the results look nicer).

Before 12/7, we want to update our code so we get the results we want. These can be adjusted after peer review to incorporate constructive criticism. We also want to add some initial writing about the maps and network connections produced (mainly as a brief outline for presentations).

Hopefully, all we will have to do after the presentations is finalize our writing and adjust the code for the maps and network connections if necessary.

katcorr commented 7 months ago

Sounds good!

Status Update 2: 5/5