Open jamessdixon opened 3 years ago
@jamessdixon
Since you are using F#, you don't have to fully qualify all of your functions. The code snippet on this page will get you part of the way there:
http://lefthandedgoat.github.io/canopy/
configuration.chromeDir <- executingDir()
reporter <- new LiveHtmlReporter(Chrome, configuration.chromeDir) :> IReporter
start chrome
url "https://finance.yahoo.com/quote/MSFT/press-releases"
That should get you there. If you hit any specific errors, let me know.
Also here is another thread that may help you. https://github.com/lefthandedgoat/canopy/issues/457
The zip file here will help: https://github.com/lefthandedgoat/canopy/issues/385
It's really really old, but the structure should be about the same
Thanks - I downloaded the Chromium Driver for my MacBook from here: http://chromedriver.storage.googleapis.com/index.html?path=88.0.4324.27/
I then ran it from the command line to get rid of the security prompt and then ran it from the F# script - worked like a champ
1 final question - how do I extract the html after the page load from the chrome type?
Something like
let marketCap = read "[data-test='MARKET_CAP-value'] span"
from: https://lefthandedgoat.github.io/canopy//Docs/actions.html
canopy lets you use xpath/css/jquery selectors by default. You can use $$("valid css selector") in Chrome dev tools to test your selectors like the one above. Here is a handy cheat sheet: https://oscarotero.com/jquery/
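To make the answer above concrete, here is a minimal sketch of an F# script that loads the page, reads one element with `read`, and also grabs the whole page HTML. It assumes canopy's `classic` module exposes the underlying Selenium driver as `browser` (so `browser.PageSource` returns the current page markup), that chromedriver lives in the hypothetical `driverDir` folder, and that the `MARKET_CAP-value` selector from the reply actually exists on the page you load. None of this was run against the live site.

```fsharp
#r "nuget: canopy"
#r "nuget: Selenium.WebDriver.ChromeDriver"

open canopy.classic

// Hypothetical location of the chromedriver binary; change this for your machine.
let driverDir = "/usr/local/bin"
canopy.configuration.chromeDir <- driverDir

// Launch Chrome and navigate to the page.
start chrome
url "https://finance.yahoo.com/quote/MSFT/press-releases"

// read returns the text of the first element matching the selector.
// It fails after the element timeout if nothing matches.
let marketCap = read "[data-test='MARKET_CAP-value'] span"

// browser is the Selenium IWebDriver behind canopy;
// PageSource is the page markup as the driver currently sees it.
let html = browser.PageSource

printfn "Market cap: %s" marketCap
printfn "Page source length: %d characters" html.Length

// Close the browser when done.
quit ()
```

If only a few values are needed, `read` with a selector is usually less brittle than parsing the full `html` string yourself.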
You may be better off using an HTTP API to get your stock data; scraping web pages is painful.
I was looking into this one recently: https://www.alphavantage.co/
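As a rough illustration of the HTTP route, here is a sketch that fetches a quote as raw JSON with `HttpClient`. The `GLOBAL_QUOTE` function name and the `symbol`/`apikey` query parameters are assumptions based on Alpha Vantage's public documentation, so check their docs before relying on them; `quoteUrl`, `fetchQuoteJson`, and the `YOUR_API_KEY` placeholder are hypothetical names made up for this example.

```fsharp
open System
open System.Net.Http

// Hypothetical helper: builds the request URL.
// Endpoint and parameter names are assumptions; verify them in the Alpha Vantage docs.
let quoteUrl (symbol: string) (apiKey: string) : string =
    sprintf "https://www.alphavantage.co/query?function=GLOBAL_QUOTE&symbol=%s&apikey=%s"
        (Uri.EscapeDataString symbol)
        (Uri.EscapeDataString apiKey)

// Hypothetical helper: performs the GET and returns the response body as a string.
let fetchQuoteJson (symbol: string) (apiKey: string) : string =
    use client = new HttpClient()
    client.GetStringAsync(quoteUrl symbol apiKey)
    |> Async.AwaitTask
    |> Async.RunSynchronously

// Replace the placeholder with your own key before running.
printfn "%s" (fetchQuoteJson "MSFT" "YOUR_API_KEY")
```

No browser or driver is involved here, which is the main reason this tends to be less painful than scraping.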
Yeah, I think you might be right. I'll check out Alpha Vantage.
Thanks for your help - I think canopy is great
This might help you with the HTTP stuff: https://github.com/amirrajan/exploring-fsharp/blob/master/002/002.fsx
Chris:
I am trying to use Canopy to scrape some web pages: pass in a URI and get the HTML document from the page (after the JavaScript runs).
I opened .fsi and wrote this code:
#r "nuget: Canopy"
#r "nuget: Selenium.WebDriver.ChromeDriver"
open canopy.classic
open canopy.configuration
open canopy.runner.classic
//canopy.configuration.chromeDir <- System.AppContext.BaseDirectory
canopy.configuration.chromeDir <- executingDir
canopy.configuration.elementTimeout <- 3.0
canopy.configuration.pageTimeout <- 3.0
let reporter = canopy.reporters.LiveHtmlReporter(Chrome, canopy.configuration.chromeDir) :> canopy.reporters.IReporter
canopy.classic.start(canopy.types.BrowserStartMode.Chrome, reporter)
let url = "https://finance.yahoo.com/quote/MSFT/press-releases"
canopy.classic.url(url)
except IReporter is defined in the ConsoleReporter type? Am I even on the right track to do this?