Closed Eji1700 closed 4 years ago
Yes, you need auth tokens/cookies to access the site; that's why it works in a browser but not in the type provider.
For save as, do you want to save the HTML of a page? This should work: https://stackoverflow.com/questions/40882331/save-generated-html-using-canopy
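A minimal sketch of the "run JS and save the result" idea, assuming the canopy package is referenced and a browser has already been started and navigated to the page. The function name savePageHtml is hypothetical, and this may not match the linked answer exactly:

```fsharp
open System.IO
open canopy.classic

// Hypothetical helper: grab the rendered DOM of the current page via
// JavaScript and write it to disk.
let savePageHtml (path: string) =
    // canopy's js function runs the script in the current browser and
    // returns the result as obj; convert it to a string before saving.
    let html = js "return document.documentElement.outerHTML;" |> string
    File.WriteAllText(path, html)
```

Note that this saves the rendered DOM, so for an XML report opened in the browser it will likely include the viewer's HTML wrapper rather than only the raw XML.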
Hmm, close, but it seems to behave slightly differently: it includes all the HTML tags instead of just giving me the XML, as, say, view source would.
Still, it hadn't clicked that I could just be running JS. I'll see if I can figure it out from there.
Actually, looking closer, it's giving me both the XML version of the page and the HTML version right below it, so maybe I can work with this.
Solved.
".pretty-print" |> element |> read gets you the XML, and then you can do what you want with it. In this case I'm shoving it into my predefined provided type (from the XML type provider) and manipulating it from there.
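Roughly, the solution above could look like the following sketch. It assumes the canopy and FSharp.Data packages are referenced and that the browser is already logged in; the sample file name sample-report.xml, the type name Report, and the function name loadReport are all hypothetical:

```fsharp
open canopy.classic
open FSharp.Data

// Hypothetical sample file saved on disk; the provider infers the
// report's shape from it at compile time.
type Report = XmlProvider<"sample-report.xml">

// Navigate the logged-in canopy browser to a report URL, read the text
// of the ".pretty-print" element, and parse it with the provided type.
let loadReport (reportUrl: string) =
    url reportUrl
    let xmlText = ".pretty-print" |> element |> read
    Report.Parse(xmlText)
```

Because the XML is read straight from the page, nothing has to be saved to disk before the values go into the database.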
Use case: I'm logging into one website, then logging into 5+ sub-websites, then running reports on each to get XML files.
The reports are "run" by URL manipulation, so often it's something like
"url.com/reports/thing.xml?params"
The first problem I'm hitting is that the non-automated workflow has you get the XML report by right-clicking, choosing save as, and saving the page as an XML file. I've got a Ruby script that replicates this with Mechanize (which has a save-as function), but I haven't found any way to do this with Canopy. There was a similar issue about PDFs, but in that case there was a button to click that triggered the save.
The second is that I'm only saving these files so that I can loop through them with an XML type provider and shove their values into a database. I'd love to be able to just pass the URL to the type provider (which already has a sample file saved on disk to reference) and go from there, but I don't know of any way to pass it my logged-in session for that URL. Naturally, if I just put the URL in, it sees the "you need to log in" page and fails.
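For the second problem, one option (a sketch, not something confirmed in this thread) is to copy the cookies out of the logged-in Canopy browser and send them with a plain HTTP request, then hand the response text to the type provider. It assumes the canopy and FSharp.Data packages, a hypothetical sample file sample-report.xml, and a site that authenticates by cookies alone; sites that also check headers or tokens would need more:

```fsharp
open canopy.classic
open FSharp.Data

// Hypothetical sample file used only for type inference.
type Report = XmlProvider<"sample-report.xml">

// Hypothetical helper: reuse the Selenium session's cookies for a direct
// HTTP request, so the server sees the same logged-in session.
let fetchReport (reportUrl: string) =
    // WebDriver only exposes cookies for the page currently loaded, so
    // the browser should be on the report's site when this runs.
    let sessionCookies =
        browser.Manage().Cookies.AllCookies
        |> Seq.map (fun c -> c.Name, c.Value)
        |> List.ofSeq
    let xmlText = Http.RequestString(reportUrl, cookies = sessionCookies)
    Report.Parse(xmlText)
```

This skips the browser's rendering of the XML entirely, at the cost of depending on how the site handles its sessions.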