cpsievert / pitchRx

Tools for scraping MLB Gameday data and Visualizing PITCHf/x
http://cpsievert.github.io/pitchRx/

Alternate Datasets access to functions #5

Closed trevorpatch closed 11 years ago

trevorpatch commented 11 years ago

Thank you for this package; even without all of the functionality, it looks absolutely amazing. I am hoping to transform the data I already have into a form that lets me take advantage of it.

These are the column headings I have:

[1] "Pitcher.MLBAM_PLAYER_MASTER_name_last"
[2] "Pitcher.MLBAM_PLAYER_MASTER_name_first"
[3] "MLBAM_PLAYER_MASTER_name_first"
[4] "MLBAM_PLAYER_MASTER_name_last"
[5] "game_pk"
[6] "game_id"
[7] "sv_pitch_id"
[8] "sequence_number"
[9] "at_bat_number"
[10] "pitch_number"
[11] "inning"
[12] "top_inning_sw"
[13] "event_number"
[14] "event_type"
[15] "pbp_number"
[16] "event_result"
[17] "pre_balls"
[18] "pre_strikes"
[19] "post_balls"
[20] "post_strikes"
[21] "batter_id"
[22] "bat_side"
[23] "pitcher_id"
[24] "throws"
[25] "initial_speed"
[26] "init_pos_x"
[27] "init_pos_y"
[28] "init_pos_z"
[29] "init_vel_x"
[30] "init_vel_y"
[31] "init_vel_z"
[32] "init_accel_x"
[33] "init_accel_y"
[34] "init_accel_z"
[35] "plate_speed"
[36] "plate_x"
[37] "plate_y"
[38] "plate_z"
[39] "break_x"
[40] "break_z"
[41] "pitch_type"
[42] "pitch_name"
[43] "time_stamp"
[44] "game_date"
[45] "game_nbr"
[46] "year"
[47] "game_type"
[48] "sz_top"
[49] "sz_bottom"
[50] "pitch_type_confidence"
[51] "spin_rate"
[52] "spin_dir"
[53] "pfx"
[54] "x0"
[55] "y0"
[56] "z0"
[57] "vx0"
[58] "vy0"
[59] "vz0"
[60] "ax"
[61] "ay"
[62] "az"

The final 9 headings (x0 through az) are just renamed copies of the "init_pos_x"-style columns (initial position, velocity, and acceleration), changed to fit the package's naming conventions.

This allows me to use the interactiveFX function without any problems. However, I cannot figure out how to turn my dataset into a form where I can use the rest of the functions. The first problem is that I cannot call the stand function to create the strike zones in the strikeFX and animateFX functions. How would you suggest I go about reshaping the dataset to take advantage of these cool functions?

Thank you!

cpsievert commented 11 years ago

strikeFX assumes the data frame input has names of "stand" (batter stance - which should have values of "L" and/or "R") and "b_height" (ie, batter height - which should have character values of format "ft-in").

It looks as though your variable "bat_side" should correspond to stand. Assuming your data set is named dat, you could do the following:

names(dat) <- gsub("bat_side", "stand", names(dat))

I don't see a variable that might correspond to the batter's height. If worst comes to worst, you could just "guesstimate" the batter's height by doing the following:

dat$b_height <- rep("6-0", dim(dat)[1])

This should at least get you started. In the future, I may try to relax the assumption that the user will have such detailed batter information.
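The two steps above can be tried on a small stand-in data frame first. This is only a sketch: the toy values are made up, and the rename uses an exact match on the column name rather than gsub so that no other column containing the text "bat_side" could be altered by accident.

```r
# Toy stand-in for the poster's data; only the column names come from the thread.
dat <- data.frame(
  bat_side = c("L", "R", "R"),
  pitch_type = c("FF", "SL", "CH"),
  stringsAsFactors = FALSE
)

# Rename only the column whose name is exactly "bat_side".
names(dat)[names(dat) == "bat_side"] <- "stand"

# Placeholder batter height in "ft-in" format (a guess, not real data).
dat$b_height <- rep("6-0", nrow(dat))

str(dat)
```

Using nrow(dat) is equivalent to dim(dat)[1] here and reads a little more directly.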

trevorpatch commented 11 years ago

Thank you for your response. I will try that and let you know how it goes!

Another question. I am mainly using this to see how flight paths differ for different pitches in different environments (Coors Field versus sea level, etc.), but it seems that by using only initial position, velocity, and acceleration it will not take different environments into account and will produce plots for a pitch in a neutral environment. Is this right, or am I totally off base here? Break amounts (and thus final positions) differ a lot for different pitches depending on where they are thrown, so I don't know whether including only the initial values will show these differences. Is there any way to take the final position and break into account when creating these plots?

Thanks again!

cpsievert commented 11 years ago

You're correct. These flight paths are recreated using the assumption(s?) of constant acceleration and/or velocity. Thus, I don't think you'll find much of a difference. However, I'm not an expert on this stuff. This website has more details if you're interested.
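To make the constant-acceleration assumption concrete, here is a minimal sketch of how a path could be rebuilt from the nine columns x0, y0, z0, vx0, vy0, vz0, ax, ay, az. It is not pitchRx's own code; the numbers, the function names position and time_to_y, and the end distance of 1.417 ft are illustrative assumptions.

```r
# Hypothetical nine-parameter values for one pitch (feet, ft/s, ft/s^2).
p <- list(x0 = -1.5, y0 = 50, z0 = 6,
          vx0 = 5, vy0 = -130, vz0 = -5,
          ax = -10, ay = 28, az = -20)

# Position at time(s) t under constant acceleration:
# s(t) = s0 + v0 * t + 0.5 * a * t^2 in each coordinate.
position <- function(p, t) {
  data.frame(
    t = t,
    x = p$x0 + p$vx0 * t + 0.5 * p$ax * t^2,
    y = p$y0 + p$vy0 * t + 0.5 * p$ay * t^2,
    z = p$z0 + p$vz0 * t + 0.5 * p$az * t^2
  )
}

# Earliest time at which y(t) equals y_end (assumes ay > 0 and vy0 < 0).
time_to_y <- function(p, y_end = 1.417) {
  disc <- p$vy0^2 - 2 * p$ay * (p$y0 - y_end)
  (-p$vy0 - sqrt(disc)) / p$ay
}

t_end <- time_to_y(p)
path <- position(p, seq(0, t_end, length.out = 20))
head(path)
```

In this sketch the whole path is determined by those nine numbers, so two pitches with identical values would be drawn identically wherever they were thrown.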

trevorpatch commented 11 years ago

Thank you for the help again, got another question now doing the strikeFX stuff.

When I run it like this:

strikeFX(pitches, geom = "hex", contour = TRUE, density1 = list(event_type = "called_strike"), density2 = list(event_type = "ball"), layer = facet_grid(Pitcher.MLBAM_PLAYER_MASTER_name_last~ stand))

Here, Pitcher.MLBAM_PLAYER_MASTER_name_last holds the pitcher's last name, stand holds L or R, event_type holds values such as called strike, ball, swinging strike, etc., and pitches is my dataset. However, when I run it I get Error in [.data.frame(data, , locations) : undefined columns selected. I thought it might be failing because I hadn't told it specifically which columns to use, so I tried this:

strikeFX(pitches, geom = "hex", contour = TRUE, density1 = list(pitches$event_type = "called_strike"), density2 = list(pitches$event_type = "ball"), layer = facet_grid(pitches$Pitcher.MLBAM_PLAYER_MASTER_name_last~ pitches$stand))

but that gives me an error: Error: unexpected '=' in " density2 = list(pitches$event_type ="

When I switch it to a double equals sign (==), it gives the Error in [.data.frame(data, , locations) : undefined columns selected error again. Any ideas on what I am doing wrong? I am sure it is something dumb I am not seeing!

trevorpatch commented 11 years ago

And for recording your awesome graphics do you use the animate package or something else?

cpsievert commented 11 years ago

It's hard for me to diagnose your error without access to your dataset. The first command you list is the correct syntax. If I had to guess, it may be that your data.frame is not actually called pitches. If it is, try renaming it to something else so it doesn't conflict with pitchRx's example dataset.

As for your second question, I'll assume you're asking how I produced the graphics on the demo page. To create this page, I use knitr and R Markdown. You can see the source code here.

If you know a bit about the animation package, you could use this to produce html files with an embedded animation deck. If this is what you're looking for, have a look at ?animation::saveHTML. This is actually what my shiny visualization app uses. However, this won't be as customizable as the Markdown approach. As with anything, there are multiple ways to skin a cat. It really depends on what you want to do.

cpsievert commented 11 years ago

Aha. I should have caught this earlier. strikeFX also assumes that you have variables named px and pz in your data.frame. These variables record the horizontal and vertical location, respectively, of the pitch as it crosses home plate. I'm not too sure which variables those correspond to in your dataset, but you'll want to rename the appropriate columns accordingly.
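Because the "undefined columns selected" error does not say which column is missing, it can help to check the names before plotting. The sketch below is not part of pitchRx; rename_columns is a hypothetical helper, and mapping plate_x to px and plate_z to pz is only an assumption about this dataset that the thread does not confirm.

```r
# Column names this thread says strikeFX expects in its input.
required <- c("stand", "b_height", "px", "pz")

# Assumed mapping from the poster's headings (not confirmed in the thread).
rename_map <- c(bat_side = "stand", plate_x = "px", plate_z = "pz")

# Hypothetical helper: rename columns whose names appear in the mapping.
rename_columns <- function(df, map) {
  hits <- names(df) %in% names(map)
  names(df)[hits] <- unname(map[names(df)[hits]])
  df
}

# Toy data using the poster's original headings; values are made up.
raw <- data.frame(
  bat_side = c("L", "R"),
  plate_x = c(-0.4, 0.6),
  plate_z = c(2.2, 3.1),
  stringsAsFactors = FALSE
)

dat <- rename_columns(raw, rename_map)
missing_before <- setdiff(required, names(dat))  # b_height is still absent

dat$b_height <- rep("6-0", nrow(dat))            # placeholder height
missing_after <- setdiff(required, names(dat))
```

An empty missing_after means every column named in this thread is present; anything listed in it is a candidate cause of the error.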

trevorpatch commented 11 years ago

You are awesome. That totally works now! I was able to figure it out for interactiveFX because its error said something like it was looking for x0, y0, z0, vx0, etc., while strikeFX only gives the undefined columns error. I made some plots and they look so awesome. Looking forward to the updates you bring to it!