lefthandedgoat / canopy

f# web automation and testing library, built on top of Selenium (friendly to c# also)
http://lefthandedgoat.github.io/canopy/
MIT License

How to run Canopy in .fsi? #509

Open jamessdixon opened 3 years ago

jamessdixon commented 3 years ago

Chris:

I am trying to use Canopy to scrape some web pages: pass in a URI and get the HTML document from a page (after the JavaScript runs).

I opened .fsi and wrote this code:

#r "nuget: Canopy"
#r "nuget: Selenium.WebDriver.ChromeDriver"

open canopy.classic
open canopy.configuration
open canopy.runner.classic

//canopy.configuration.chromeDir <- System.AppContext.BaseDirectory
canopy.configuration.chromeDir <- executingDir
canopy.configuration.elementTimeout <- 3.0
canopy.configuration.pageTimeout <- 3.0
let reporter = canopy.reporters.LiveHtmlReporter(Chrome, canopy.configuration.chromeDir) :> canopy.reporters.IReporter
canopy.classic.start(canopy.types.BrowserStartMode.Chrome, reporter)
let url = "https://finance.yahoo.com/quote/MSFT/press-releases"
canopy.classic.url(url)

except IReporter is defined in the ConsoleReporter type? Am I even on the right track to do this?

lefthandedgoat commented 3 years ago

@jamessdixon

Since you are using f# you don't have to fully qualify all of your functions. The code snippet on this page will get you part way there:

http://lefthandedgoat.github.io/canopy/

configuration.chromeDir <- executingDir()
reporter <- new LiveHtmlReporter(Chrome, configuration.chromeDir) :> IReporter

start chrome

url "https://finance.yahoo.com/quote/MSFT/press-releases"

That should get you there. If you have any specific errors, let me know.
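
For an .fsi session, the snippet above could be assembled roughly as follows. This is only a sketch: it assumes the `canopy` NuGet package, and the `chromeDir` path is a hypothetical folder holding the chromedriver executable. The reporter line is left out on the assumption that it only matters when running tests through the canopy runner.

```fsharp
// Load canopy (Selenium.WebDriver comes in as a dependency).
#r "nuget: canopy"

open canopy.classic

// Hypothetical path: replace with the folder that contains chromedriver.
canopy.configuration.chromeDir <- "/path/to/chromedriver-folder"

// Launch Chrome and navigate to the page.
start chrome
url "https://finance.yahoo.com/quote/MSFT/press-releases"
```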

lefthandedgoat commented 3 years ago

Also here is another thread that may help you. https://github.com/lefthandedgoat/canopy/issues/457

amirrajan commented 3 years ago

The zip file here will help: https://github.com/lefthandedgoat/canopy/issues/385

It's really really old, but the structure should be about the same

jamessdixon commented 3 years ago

Thanks - I downloaded the Chromium Driver for my MacBook from here: http://chromedriver.storage.googleapis.com/index.html?path=88.0.4324.27/

I then ran it from the command line to get rid of the security prompt and then ran it from the F# script - worked like a champ

1 final question - how do I extract the html after the page load from the chrome type?


lefthandedgoat commented 3 years ago

Something like

let marketCap = read "[data-test='MARKET_CAP-value'] span"

from: https://lefthandedgoat.github.io/canopy//Docs/actions.html

canopy lets you use xpath/css/jquery selectors by default. You can use $$("valid css selector") in chrome dev tools to test your selectors like the one above. Here is a handy cheat sheet: https://oscarotero.com/jquery/
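
To get the whole document rather than one element, one option is Selenium's `PageSource` property on the `browser` value that canopy exposes. The sketch below is an illustration, not something confirmed in this thread: `getHtml` is a hypothetical helper name, and it assumes chromedriver can be found (set `canopy.configuration.chromeDir` otherwise).

```fsharp
#r "nuget: canopy"

open canopy.classic

// Hypothetical helper: navigate to a URI and return the HTML as the
// browser currently sees it. Requires a browser started via `start chrome`.
let getHtml (uri: string) : string =
    url uri
    // Content injected by JavaScript after load may need an explicit wait
    // before reading the source.
    browser.PageSource

start chrome

let html = getHtml "https://finance.yahoo.com/quote/MSFT/press-releases"

// Text of a single element, using the selector from the reply above;
// it may not match anything on this particular page.
// let marketCap = read "[data-test='MARKET_CAP-value'] span"

quit ()
```

`read` is the simpler choice when only one value is needed. `PageSource` fits the case where the HTML is handed to a separate parser.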

You may be better off using an http api to get your stock data; scraping webpages is painful.

I was looking into this one recently: https://www.alphavantage.co/

jamessdixon commented 3 years ago

Yeah, I think you might be right. I’ll check out Alpha Vantage.

Thanks for your help - I think canopy is great


amirrajan commented 3 years ago

this might help you with http stuff: https://github.com/amirrajan/exploring-fsharp/blob/master/002/002.fsx