stat231-f20 / Blog-MoneyMovers

Repository for PUG Blog Project – Money Movers
https://stat231-f20.github.io/Blog-MoneyMovers

Scraping the full Yahoo Finance table (scrolling webpage) #2

Open katcorr opened 4 years ago

katcorr commented 4 years ago

I've pushed code to your team repo that scrapes all the data for the company "MMM".

It's kind of neat/freaky to watch the web browser being controlled automatically. You may need to update Chrome if you aren't on the latest version. Then:

Test this out and see if this single webpage test case works on your computers. Then, if needed, I can help incorporate it into your for loop to loop through all the web pages.
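For the looping step, here is a rough sketch of how the single-page case might generalize. It is not the code pushed to the repo. The helper names (history_url, scrape_ticker, scrape_all), the URL pattern, the number of scrolls, and the pause length are all assumptions to adjust for the real page.

```r
# Hypothetical helper: builds the page URL for one ticker.
# The URL pattern is an assumption; swap in the address you are scraping.
history_url <- function(ticker) {
  sprintf("https://finance.yahoo.com/quote/%s/history", ticker)
}

# Press the End key repeatedly so the lazily loaded rows appear,
# then parse the first <table> found in the rendered page source.
scrape_ticker <- function(remDr, ticker, n_scrolls = 20, pause = 1) {
  remDr$navigate(history_url(ticker))
  body <- remDr$findElement(using = "css selector", value = "body")
  for (i in seq_len(n_scrolls)) {
    body$sendKeysToElement(list(key = "end"))
    Sys.sleep(pause)  # give the page time to load the next batch of rows
  }
  page <- rvest::read_html(remDr$getPageSource()[[1]])
  tab <- rvest::html_table(rvest::html_node(page, "table"))
  tab$ticker <- ticker
  tab
}

# Start one browser session, loop over all tickers, and stack the results.
scrape_all <- function(tickers, port = 4567L) {
  rD <- RSelenium::rsDriver(port = port, browser = "chrome")
  remDr <- rD$client
  # Close the browser and stop the server even if a page fails
  on.exit({
    remDr$close()
    rD$server$stop()
  }, add = TRUE)

  results <- vector("list", length(tickers))
  for (i in seq_along(tickers)) {
    results[[i]] <- scrape_ticker(remDr, tickers[i])
  }
  do.call(rbind, results)
}
```

Calling scrape_all("MMM") would reproduce the single-page test; passing a longer vector of tickers runs the same steps for each. The rbind at the end assumes every page returns a table with the same columns.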

katcorr commented 4 years ago

If you run into the error "Selenium server signals port = 4837 is already in use.", try changing the port number in the rsDriver command, e.g. rD <- rsDriver(port = 4000L, browser = "chrome")
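If changing the port by hand gets tedious, one option is to try a few ports in turn. This is only a sketch: start_driver is a hypothetical helper, the candidate port numbers are arbitrary, and it assumes the failure arrives as an ordinary R error whose message contains "already in use", as in the message quoted above.

```r
# Hypothetical helper: try each candidate port until the driver starts.
# start_fun defaults to RSelenium::rsDriver; it is a parameter only so the
# retry logic can be exercised without launching a browser.
start_driver <- function(ports = c(4567L, 4000L, 4444L),
                         start_fun = RSelenium::rsDriver, ...) {
  for (p in ports) {
    rD <- tryCatch(
      start_fun(port = p, browser = "chrome", ...),
      error = function(e) {
        # Only swallow the port clash; re-raise anything else
        # (for example a Chrome/driver version mismatch).
        if (grepl("already in use", conditionMessage(e), fixed = TRUE)) {
          NULL
        } else {
          stop(e)
        }
      }
    )
    if (!is.null(rD)) return(rD)
  }
  stop("None of the candidate ports were free: ",
       paste(ports, collapse = ", "))
}
```

The port usually stays busy because an earlier Selenium server is still running, so calling rD$server$stop() when you finish a session should prevent most of these clashes in the first place.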

katcorr commented 4 years ago

@zostrow2001 @luwilliam20

I know you mentioned you may just use what's initially scraped (without scrolling), but just in case you have time to go back and add this additional data: we were receiving errors related to the Chrome browser before. It turns out rsDriver has an argument, chromever, that specifies which driver version to use, and it needs to match the version of Chrome you have installed:

# Make sure your driver version matches the version of chrome you have installed
rD <- rsDriver(browser = "chrome", chromever = "85.0.4183.83")

Update the chromever value to match the version you have installed. That should fix the issue!
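If you're not sure which value to use, a sketch like the following may help. It assumes that a driver sharing Chrome's major version (the leading "85") is compatible, and that binman::list_versions("chromedriver") reports the driver versions already downloaded on your machine. pick_chromever and start_matching_driver are hypothetical helper names, not part of RSelenium.

```r
# Hypothetical helper: from the available driver versions, pick the newest
# one whose major version matches the installed Chrome version.
pick_chromever <- function(chrome_version, available) {
  major <- function(v) sub("\\..*$", "", v)
  hits <- available[major(available) == major(chrome_version)]
  if (length(hits) == 0) {
    stop("No downloaded chromedriver matches Chrome ", chrome_version)
  }
  as.character(max(numeric_version(hits)))
}

# Look up the downloaded drivers and start a session with the matching one.
# chrome_version is whatever chrome://version reports in your browser.
start_matching_driver <- function(chrome_version) {
  available <- unlist(binman::list_versions("chromedriver"))
  RSelenium::rsDriver(
    browser = "chrome",
    chromever = pick_chromever(chrome_version, available)
  )
}
```

If no downloaded driver matches, the helper stops with an error rather than guessing, which is usually a sign that Chrome or the driver needs updating.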