beanumber / baseball_R

Companion to Analyzing Baseball Data with R, 2nd edition

Issues with parse_retrosheet_pbp() files #2

Closed garyhogaryho closed 10 months ago

garyhogaryho commented 5 years ago

Hi Ben,

When I tried to duplicate the script at the beginning of Chapter 5, the "all2016.csv" file had no data in it. The "ros2016.csv" file works fine. I tried this for other years as well (1950, as mentioned in the appendix, and 2018): "all1950.csv" and "all2018.csv" have no data in them, but the roster files do.

Attached is a screenshot for your reference. Thanks!

image

beanumber commented 5 years ago

@garyhogaryho What happens if you run create_csv_file(1950)?

I suspect this function is breaking and that is why the all1950.csv has no data in it.

Can you also run:

file.info("data/all1950.csv")

garyhogaryho commented 5 years ago

@beanumber I tried running both and it still didn't work. I tried it for both the 1950 and 2016 seasons. I received the following errors when I tried this for the 2016 season:

create_csv_file(2016)
cwevent -y 2016 -f 0-96 2016.EV > all2016.csv

Chadwick expanded event descriptor, version 0.7.0
  Type 'cwevent -h' for help.
Copyright (c) 2002-2017
Dr T L Turocy, Chadwick Baseball Bureau (ted.turocy@gmail.com)
This is free software, subject to the terms of the GNU GPL license.

Can't find teamfile (team2016)
Warning message:
In shell(cmd) :
  'cwevent -y 2016 -f 0-96 2016.EV > all2016.csv' execution failed with error code 1

file.info("data/all2016.csv")
                 size isdir mode               mtime
data/all2016.csv    0 FALSE  666 2019-03-16 15:09:53
                               ctime               atime exe
data/all2016.csv 2019-03-16 15:05:59 2019-03-16 15:09:53  no

beanumber commented 5 years ago

What is in data/unzipped/? Can you try running create_csv_roster() and then create_csv_file() afterwards?

This is very hard to debug without a complete list of your files. It looks like cwevent is looking for a team2016 file that it can't find. Either it didn't get created, or it's looking for it in the wrong place.
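
To see whether the team file is where cwevent expects it, a quick check from R can help. This is a minimal sketch, not part of the book's scripts: check_retrosheet_inputs() is a hypothetical helper, the default folder is taken from the layout described later in this thread, and the TEAM<season> and .EVA/.EVN file naming is an assumption about how the unzipped Retrosheet files are named.

```r
# Hypothetical helper (not part of the book's scripts): report what is present
# in the folder where cwevent will be run for a given season.
check_retrosheet_inputs <- function(season, dir = "retrosheet/unzipped") {
  files <- list.files(dir)

  # Compare names case-insensitively: the error message says team2016, but the
  # file on disk may be spelled differently (assumption: TEAM<season>).
  team_name <- paste0("TEAM", season)
  has_team_file <- any(toupper(files) == team_name)

  # Assumption: event files start with the season and end in .EVA or .EVN.
  event_pattern <- paste0("^", season, ".*\\.EV[AN]$")
  event_files <- grep(event_pattern, files, value = TRUE, ignore.case = TRUE)

  list(
    dir_exists = dir.exists(dir),
    has_team_file = has_team_file,
    event_files = event_files
  )
}

# Example: str(check_retrosheet_inputs(1950))
```

If has_team_file is FALSE or event_files is empty, the files are probably in a different folder from the one cwevent is being run in.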

garyhogaryho commented 5 years ago

Hey Ben, I'll try to show what I did and maybe you can find the error. I'll use the 1950 season, as in the appendix. I created a new folder called "retrosheet" in the working directory "test1".

image

I then created the "zipped" and "unzipped" folders in the "retrosheet" folder.

image

I placed the file cwevent.exe from the Chadwick files into the "unzipped" folder.

image

In RStudio I ran source("parse_retrosheet_pbp.R") and then parse_retrosheet_pbp(1950).

image

In the "unzipped" folder I now have these files:

image

This is where I get confused. I believe I am supposed to move these files into the "data" folder of my working directory, which I did. Here are the files in the folder:

image

If I run the code exactly as it says in the appendix, "fields" and "rosters" work, but "data" is empty: it has 0 obs. of 0 variables.

image

Then I tried the troubleshooting steps you suggested in a new R project, and this is what I got:

image

As you mentioned, I don't know where this team1950 file is supposed to be located. I also noticed that when I initially parsed the file, there was this error:

cwevent -y 1950 -f 0-96 1950.EV > all1950.csv

Chadwick expanded event descriptor, version 0.7.0
  Type 'cwevent -h' for help.
Copyright (c) 2002-2017
Dr T L Turocy, Chadwick Baseball Bureau (ted.turocy@gmail.com)
This is free software, subject to the terms of the GNU GPL license.

[Processing file 1950.EV.]
Warning: could not open file '1950.EV'

Anyway, thanks for taking the time to look this over and help me. If we are unable to get this to work, no worries, I'll move on to another chapter of the book.

beanumber commented 4 years ago

Can you try running this command once you are already in the same folder as the *.EV files?
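
One way to do that from R without changing the session's working directory permanently is sketched below. This is only an illustration: with_working_dir() is a hypothetical helper, not a function from the book's scripts, and the folder name in the usage comment is the one described earlier in this thread.

```r
# Hypothetical helper: evaluate `expr` with `dir` as the working directory,
# then restore the previous working directory, even if `expr` fails.
with_working_dir <- function(dir, expr) {
  old <- setwd(dir)                 # setwd() returns the previous directory
  on.exit(setwd(old), add = TRUE)   # always switch back on exit
  force(expr)  # lazy evaluation: expr runs only here, after setwd()
}

# Harmless usage example: list the files cwevent would see.
# with_working_dir("retrosheet/unzipped", list.files())
```

The cwevent call (for example through shell(), as in the warning message above) could be passed as the second argument in the same way.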

thooper1 commented 4 years ago

Ok. So here is what I did. I ran the parse function piece by piece. No dice. I saw the TEAM2019 file in the zipped folder but it still failed. Could this be because it's looking for "2019.EV" when there is no "2019.EV" folder?

image1

beanumber commented 10 months ago

Please note that our advice about how to download Retrosheet files has changed, and I hope that this is no longer a problem:

https://beanumber.github.io/abdwr3e/A_retrosheet.html#downloading-play-by-play-files