Closed: Dazzalytics closed this issue 2 years ago
I think this might depend on your operating system. Here's what I get on Ubuntu:
library(cricketdata)
fetch_cricsheet(type = "match", gender = "male", competition = "psl")
#> # A tibble: 206 × 25
#> match_id balls_per_over team1 team2 gender season date event match_number
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 1075986 6 Islamab… Pesh… male 2016/… 2017… Paki… 1
#> 2 1075988 6 Karachi… Pesh… male 2016/… 2017… Paki… 3
#> 3 1075995 6 Islamab… Kara… male 2016/… 2017… Paki… 10
#> 4 1075997 6 Islamab… Pesh… male 2016/… 2017… Paki… 12
#> 5 1076001 6 Lahore … Pesh… male 2016/… 2017… Paki… 16
#> 6 1076005 6 Islamab… Kara… male 2016/… 2017… Paki… 20
#> 7 1076007 6 Karachi… Isla… male 2016/… 2017… Paki… <NA>
#> 8 1076008 6 Peshawa… Kara… male 2016/… 2017… Paki… <NA>
#> 9 1075994 6 Peshawa… Quet… male 2016/… 2017… Paki… 9
#> 10 1075990 6 Karachi… Quet… male 2016/… 2017… Paki… 5
#> # … with 196 more rows, and 16 more variables: venue <chr>, city <chr>,
#> # toss_winner <chr>, toss_decision <chr>, player_of_match <chr>,
#> # umpire1 <chr>, umpire2 <chr>, reserve_umpire <chr>, tv_umpire <chr>,
#> # match_referee <chr>, winner <chr>, winner_wickets <chr>, method <chr>,
#> # winner_runs <chr>, outcome <chr>, eliminator <chr>
fetch_cricsheet(type = "player", gender = "male", competition = "psl")
#> # A tibble: 4,534 × 3
#> team player match_id
#> <chr> <chr> <chr>
#> 1 Islamabad United DR Smith 1075986
#> 2 Islamabad United Sharjeel Khan 1075986
#> 3 Islamabad United BJ Haddin 1075986
#> 4 Islamabad United SR Watson 1075986
#> 5 Islamabad United SW Billings 1075986
#> 6 Islamabad United Misbah-ul-Haq 1075986
#> 7 Islamabad United Imran Khalid 1075986
#> 8 Islamabad United Amad Butt 1075986
#> 9 Islamabad United Saeed Ajmal 1075986
#> 10 Islamabad United Mohammad Sami 1075986
#> # … with 4,524 more rows
Created on 2022-02-20 by the reprex package (v2.0.1)
I'll need to find a Windows computer to test it on.
@jacquietran Are you using Windows? Can you replicate this issue?
Hey @robjhyndman and @Dazzalytics!
I can reproduce the issue on Windows:
> library(cricketdata)
> fetch_cricsheet(type = "match", gender = "male", competition = "psl")
# trying URL 'https://cricsheet.org/downloads/psl_male_csv2.zip'
# Content type 'application/zip' length 1002739 bytes (979 KB)
# downloaded 979 KB
# A tibble: 206 x 25
# match_id balls_per_over team1 team2 gender season date event
# <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
# 1 "C:\\Users\~ 6 Islam~ Pesha~ male 2016/~ 2017~ Paki~
# 2 "C:\\Users\~ 6 Karac~ Pesha~ male 2016/~ 2017~ Paki~
# 3 "C:\\Users\~ 6 Islam~ Karac~ male 2016/~ 2017~ Paki~
# 4 "C:\\Users\~ 6 Islam~ Pesha~ male 2016/~ 2017~ Paki~
# 5 "C:\\Users\~ 6 Lahor~ Pesha~ male 2016/~ 2017~ Paki~
# 6 "C:\\Users\~ 6 Islam~ Karac~ male 2016/~ 2017~ Paki~
# 7 "C:\\Users\~ 6 Karac~ Islam~ male 2016/~ 2017~ Paki~
# 8 "C:\\Users\~ 6 Pesha~ Karac~ male 2016/~ 2017~ Paki~
# 9 "C:\\Users\~ 6 Pesha~ Quett~ male 2016/~ 2017~ Paki~
# 10 "C:\\Users\~ 6 Karac~ Quett~ male 2016/~ 2017~ Paki~
# ... with 196 more rows, and 17 more variables: match_number <chr>,
# venue <chr>, city <chr>, toss_winner <chr>, toss_decision <chr>,
# player_of_match <chr>, umpire1 <chr>, umpire2 <chr>,
# reserve_umpire <chr>, tv_umpire <chr>, match_referee <chr>,
# winner <chr>, winner_wickets <chr>, method <chr>,
# winner_runs <chr>, outcome <chr>, eliminator <chr>
> fetch_cricsheet(type = "player", gender = "male", competition = "psl")
# A tibble: 4,534 x 3
# team player match_id
# <chr> <chr> <chr>
# 1 Islamabad United DR Smith "C:\\Users\\jacqu\\AppData\\Local\\Temp\\Rtmp29I~
# 2 Islamabad United Sharjeel Khan "C:\\Users\\jacqu\\AppData\\Local\\Temp\\Rtmp29I~
# 3 Islamabad United BJ Haddin "C:\\Users\\jacqu\\AppData\\Local\\Temp\\Rtmp29I~
# 4 Islamabad United SR Watson "C:\\Users\\jacqu\\AppData\\Local\\Temp\\Rtmp29I~
# 5 Islamabad United SW Billings "C:\\Users\\jacqu\\AppData\\Local\\Temp\\Rtmp29I~
# 6 Islamabad United Misbah-ul-Haq "C:\\Users\\jacqu\\AppData\\Local\\Temp\\Rtmp29I~
# 7 Islamabad United Imran Khalid "C:\\Users\\jacqu\\AppData\\Local\\Temp\\Rtmp29I~
# 8 Islamabad United Amad Butt "C:\\Users\\jacqu\\AppData\\Local\\Temp\\Rtmp29I~
# 9 Islamabad United Saeed Ajmal "C:\\Users\\jacqu\\AppData\\Local\\Temp\\Rtmp29I~
# 10 Islamabad United Mohammad Sami "C:\\Users\\jacqu\\AppData\\Local\\Temp\\Rtmp29I~
# ... with 4,524 more rows
It should be fixed now: https://github.com/robjhyndman/cricketdata/commit/80ded20df6446647a804a0b97fb0a9e5658becce
Thank you for the prompt feedback.
I have version 0.1.1 of the package and, unfortunately, the issue is still there. I don't have package-building experience, but I would like to help fix this (with some guidance), and possibly add more functionality to the package or work on ideas you already have.
You will need to reinstall the package from GitHub and then try again. I have tested it on two Windows computers and it worked on both.
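For anyone else still seeing file paths in match_id, reinstalling from GitHub might look like the sketch below. It assumes the remotes package as the installer (an assumption; the thread does not say which tool was used) and takes the repository name from the commit link above.

```r
# Assumption: 'remotes' is used as the installer; install it first if missing.
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}

# Install the development version of cricketdata from GitHub.
remotes::install_github("robjhyndman/cricketdata")

# Restart R, then check which version is now installed.
packageVersion("cricketdata")
```

After restarting the R session, rerunning the fetch_cricsheet() calls from earlier in the thread should show whether match_id is back to plain ids.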
Yes, it is working now. I appreciate your help!
First up, great work in making this package.
I was working with PSL data and noticed that the match IDs in the player and match datasets are long file paths rather than bare IDs. For example, C:\Users\AppData\Local\Temp\Rtmpuu2sV2/psl_male_bbb/1075986 shows up as the match_id, where 1075986 is the actual match ID.
I wrote the simple function below to fix it for myself, but there is probably a way to fix it in the package code. Thank you!
clean_match_id = function(df){ sapply(strsplit(df$match_id, "/"), '[',3) }
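The function above relies on the id being the third "/"-separated piece, which holds for the path shown but would break if the temp directory path contained a different number of forward slashes. A slightly more defensive sketch, taking the match_id vector rather than the data frame, is to keep only the last path component. The helper below is hypothetical, not part of cricketdata and not necessarily how the package fix works; the first example path is the one quoted above.

```r
# Hypothetical helper (not part of cricketdata): keep only the final path
# component of each match_id, whichever separator the platform used.
clean_match_id <- function(match_id) {
  # Normalise Windows backslashes to forward slashes so basename() behaves
  # the same on every operating system.
  normalised <- gsub("\\", "/", match_id, fixed = TRUE)
  basename(normalised)
}

ids <- c(
  "C:\\Users\\AppData\\Local\\Temp\\Rtmpuu2sV2/psl_male_bbb/1075986",
  "1075988"  # an id that is already clean passes through unchanged
)
clean_match_id(ids)
#> [1] "1075986" "1075988"
```

It could then be applied in place with something like df$match_id <- clean_match_id(df$match_id), where df is the tibble returned by fetch_cricsheet().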