Torvaney / ggsoccer

Plot soccer event data in R/ggplot2
https://torvaney.github.io/ggsoccer/
Other
178 stars 27 forks source link

The radius of the penalty arc is slightly too large #35

Closed mdonoghoe closed 5 months ago

mdonoghoe commented 5 months ago

At the moment the penalty arc is defined to have a radius equal to the distance from the goal line to the penalty spot.

In reality, its radius is 10 yards; slightly smaller than the 12 yards of the penalty spot. I know this is a minor quibble, but it throws out my plots slightly when a shot from 24 yards out appears to be on the edge of the arc (which should be 22 yards out).

It seems that not all of the data provider pitch dimensions are scaled consistently (e.g. the penalty_box_length is not always 3 * six_yard_box_length), and there are many possibilities that would give slightly different results on strangely scaled pitches (e.g., using the distance from the penalty spot to the edge of the penalty area to rescale the radius), but I think a simple solution would be to set the arc radius to be 10 / 12 * penalty_spot_distance.

The same could be applied to the radius of the centre circle, which is currently also set to the penalty spot distance.

Thanks for this great package!

Torvaney commented 5 months ago

Hi Mark,

Thanks for your feedback. I think your suggestion of rescaling the distance to 10/12 sounds good. I agree that having shots show up too close to the line is frustrating.

I'm away at the moment, but I should be able to take a look and make sure it works with all the implemented data providers in a couple of weeks. As you suggest, the main issue with the penalty arc is that some providers coordinates can end up making an accurate arc look a bit weird (or even fail to show up at all!).

If you feel like drawing up a PR in the meantime, by all means go ahead 😄

Torvaney commented 5 months ago

Okay, so using the following script, I generated blank pitches for each set of dimensions:

library(ggplot2)
library(ggsoccer)
library(patchwork)

generate_blank_plot <- function(spec) {
  p <- 
    ggplot() +
    annotate_pitch(dimensions = spec$dim, goals = goals_line) +
    theme_pitch() +
    ggtitle(spec$name)

  p
}

all_pitches <- 
  list(
    list(dim = pitch_impect, name = "Impect"),
    list(dim = pitch_statsbomb, name = "Statsbomb"),
    list(dim = pitch_statsperform, name = "StatsPerform"),
    list(dim = pitch_tracab, name = "Tracab"),
    list(dim = pitch_wyscout, name = "Wyscout")
  ) |> 
  purrr::map(generate_blank_plot) |> 
  purrr::reduce(function(x, y) x + y) +
  plot_layout(nrow = 1)

Current implementation (1:1 scaling):

pitch-comparison

Suggested implementation (10/12 scaling):

pitch-comparison-scaled

I think this largely comes down to aesthetics and taste. Some of the dimensions (those which are closer to real pitches) look better with the more realistic scaling, others (Wyscout) look really bad. Or at least, that's my opinion...

In any case, I don't think there's a one-size fits all solution. Seeing as how it's a matter of taste, I think the best option is probably to add an optional field to the pitch dimensions to control the arc radius, with a sensible default.

What do you think?

mdonoghoe commented 5 months ago

Thanks for looking into this. I also had a go at implementing it (including an additional edit to use grid::arcCurvature to calculate the correct geom_curve curvature) and basically got the same results as you.

As you noted, it doesn't work well for the pitches where the aspect ratio of the original scale is 1:1 (StatsPerform / Opta & Wyscout). I think the issue is that the start and end points of the curves are defined based on where a circle intersects the penalty box / halfway line, but on a 1:1 pitch it should actually be an ellipse-like shape, with a longer 'radius' in the vertical direction. In other words, 10 yards vertically is more 'units' than 10 yards horizontally, and importantly we are using the horizontal distance to define the radius, which is why the circles appear smaller for these pitches (even with the current implementation). If drawn using geom_line, this shape would be transformed back into a circle when plotted on a 68:105 pitch. But geom_curve doesn't seem to work like that anyway, because it plots a circle no matter what aspect ratio is used.

I think I'd need a bit more time to work this out properly (geom_curve is still a bit of a mystery to me), but I agree that adding a parameter for arc radius would be a simple fix. Maybe setting the default to be penalty_spot_distance would be a good choice for backwards compatibility.

Torvaney commented 5 months ago

including an additional edit to use grid::arcCurvature to calculate the correct geom_curve curvature

Oh, great spot! Do you mind if I make this change?

it should actually be an ellipse-like shape

Ah yes, of course - if we wanted to do it properly, we'd need to account for the weird geometry of the data providers' pitches (although I suppose you could argue it's undefined outside of the lines?).

Maybe setting the default to be penalty_spot_distance would be a good choice for backwards compatibility.

Yeah, I was thinking the same. I think I'll do that for now.

Thanks for your help - let me know if you come up with any other ideas :)

Torvaney commented 5 months ago

FWIW:

With approximate arc curvature calculation: pitch-comparison-old-arc

With correct arc curvature: pitch-comparison

It's not so noticeable here, but if you flick between the images you can see the difference. I do think it helps on the Wyscouts and your StatsPerforms

mdonoghoe commented 5 months ago

Do you mind if I make this change?

Of course, go ahead.

Thanks for your interest in such a minor quibble!