Closed jestarr closed 6 years ago
steamer data has probably changed how it returns since I wrote these functions!
Any plans to update this package for 2018?
I probably should!
Anything :+1: ?
Seems reasonable. Can you look for the 2018 URL and post it here?
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/almartin82/projprep/issues/49#issuecomment-374925163, or mute the thread https://github.com/notifications/unsubscribe-auth/AAvvN-MuOK3RpxCe6i5sjeAUqoplnlTcks5tgkw-gaJpZM4QwndE .
okay, this was actually a fairly easy fix! looks like fangraphs (rightly) pushes all traffic to `https`, but the html parser I am using is pretty old, and only handles `http` traffic. the solution was an intermediate step where we read the content in using `RCurl::getURL` and then pass to `XML::readHTMLTable`.
Should probably move all of this to `rvest` / `httr`, which is the more current way of handling web content, but this seems to work for now.
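For anyone following along, the two-step workaround described above can be sketched roughly like this. It is a minimal illustration, not the package's actual code: `parse_html_tables`, `fetch_html_tables`, and the `followlocation` option are assumptions made for the example, and only `RCurl::getURL` and `XML::readHTMLTable` come from the comment itself.

```r
# Hypothetical helper: parse every <table> found in a string of HTML.
parse_html_tables <- function(html_text) {
  # asText = TRUE tells the parser the input is content, not a file or URL
  doc <- XML::htmlParse(html_text, asText = TRUE)
  XML::readHTMLTable(doc, stringsAsFactors = FALSE)
}

# Hypothetical helper: download the page over https first, then parse the
# downloaded text, so the older XML parser never has to open the URL itself.
fetch_html_tables <- function(url) {
  html_text <- RCurl::getURL(url, followlocation = TRUE)
  parse_html_tables(html_text)
}
```

Splitting the download from the parsing also makes it possible to check the parsing step against a saved page without touching the network.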
@jestarr let me know if this solves steamer / fangraphs for you.
hmm now I get
Error in select_impl(.data, vars) :
found duplicated column name: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,
issue is in `clean_raw_fangraphs`...
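The run of empty names in that message suggests the scraped table came back with blank column headers, which the `select` step then rejects as duplicates. One possible workaround is to give every column a non-empty, unique name before any further cleaning. A base R sketch, where `repair_names` and the `"col"` placeholder are hypothetical and not part of projprep:

```r
# Hypothetical helper: fill in blank or NA names, then de-duplicate them.
repair_names <- function(df, placeholder = "col") {
  nm <- names(df)
  blank <- is.na(nm) | nm == ""
  nm[blank] <- placeholder
  names(df) <- make.unique(nm, sep = "_")
  df
}

# Example: a table whose first two headers came back empty.
raw <- data.frame(1:2, 3:4, 5:6)
names(raw) <- c("", "", "Name")
fixed <- repair_names(raw)
names(fixed)  # "col" "col_1" "Name"
```

This only makes the names selectable. It does not recover what the headers were meant to be, so the real fix probably still belongs in the parsing step.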
It's taking forever to pull down any data using get_steamer. Would rvest make the scrape quicker? I wish Fangraphs had an open source API.
It's pretty slow (5-10 min?) but it should resolve.
The scraping strategy was designed to be comprehensive but not fast. If memory serves it traverses every team x every position - so there are a lot of calls happening behind the scenes.
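To give a rough sense of why this takes minutes, one request per team and position combination adds up quickly. The numbers below are assumptions for illustration only (30 teams, a made-up list of 8 position codes, about one second per request) and are not taken from the package:

```r
teams <- seq_len(30)                                          # assumed team count
positions <- c("c", "1b", "2b", "ss", "3b", "of", "dh", "p")  # hypothetical codes

# One row per (team, position) pair, i.e. one request each.
requests <- expand.grid(team = teams, position = positions,
                        stringsAsFactors = FALSE)

n_requests <- nrow(requests)           # 30 * 8 = 240 requests
seconds_per_request <- 1               # assumed average latency
est_minutes <- n_requests * seconds_per_request / 60  # 4 minutes
```

Switching parsers would not change the number of requests, so most of the wait is likely network time rather than parsing time.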
I'm getting the following error message: "Error in names(df)[2] <- "fg_note" : 'names' attribute [2] must be the same length as the vector [1]"
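That message is what base R reports when code tries to set a second name on an object that has only one element, for example a data frame with a single column. That may mean the scraped table came back narrower than expected. Below is a small reproduction and a defensive check, where `set_second_name` is a hypothetical helper rather than projprep code:

```r
one_col <- data.frame(x = 1:3)

# Reproduce: assigning names(df)[2] on a one-column data frame fails.
msg <- tryCatch({
  names(one_col)[2] <- "fg_note"
  "no error"
}, error = function(e) conditionMessage(e))

# Hypothetical guard: fail early with a clearer message when the table
# does not have the column we are about to rename.
set_second_name <- function(df, new_name) {
  if (ncol(df) < 2) {
    stop("expected at least 2 columns, got ", ncol(df))
  }
  names(df)[2] <- new_name
  df
}
```

Printing `str(df)` just before the failing line would show what the scrape returned in that case.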