Closed GoogleCodeExporter closed 9 years ago
This is again related to issue 41.
Original comment by ariya.hi...@gmail.com
on 28 Feb 2011 at 11:51
Issue 92 has been merged into this issue.
Original comment by roejame...@gmail.com
on 26 Apr 2011 at 5:52
I'm trying to implement this functionality and not making much progress. Using
the attached patch, I run:
$ bin/phantomjs examples/download.js
and get this output:
WebPage instantiated
WebPage instantiated
Download complete - fail
<html><head></head><body></body></html>
I added a cout of "WebPage instantiated" (to verify that my debug messages work
as expected). I also added a cout in my downloadRequested slot, but that one was
never displayed. Can someone spot what I'm doing wrong, or tell me if I'm on
completely the wrong track?
Here is where I found out about the downloadRequested signal:
http://doc.qt.nokia.com/latest/qwebpage.html#downloadRequested
Original comment by brian.th...@gmail.com
on 23 Jun 2011 at 3:13
Attachments:
Whoops, here is the patch file attachment without the ANSI color codes
Original comment by brian.th...@gmail.com
on 23 Jun 2011 at 3:16
Attachments:
Any progress on this issue?
Original comment by nperria...@gmail.com
on 16 Aug 2011 at 3:05
No progress as of now.
Original comment by ariya.hi...@gmail.com
on 16 Aug 2011 at 4:56
A friend of mine (http://svay.com/) just told me a nice trick for working
around this issue: use XHR within the page environment and base64 encoding to
retrieve the file contents. It works rather well. For the record, you can find
an example here: http://jsfiddle.net/3kUXy/
Original comment by nperria...@gmail.com
on 16 Aug 2011 at 4:59
The URL to the file is not always known so XHR is not a general solution. For
instance, if you are downloading a utility/bank/cc statement, you may have to
click a link which will possibly execute some JS code and trigger another page
load with a frame embedding the PDF. Or the statement comes in as an
attachment.
What will it take to support the file download feature?
Requirement: Download files that come in embedded in the page/frame or as
attachments. The URLs may or may not be known. Allow saving the files to the
file system or "upload" them to a web server (so the server can save the files
in a DB for instance).
Original comment by gopiredd...@gmail.com
on 27 Jul 2012 at 8:16
I've got an early but functional version of this at
https://github.com/woodwardjd/phantomjs/tree/add_download_capabilities
Example:
var page = require('webpage').create();
page.onUnsupportedContentReceived = function(data) {
    console.log('Got a download at url: ' + data.url);
    page.saveUnsupportedContent('some.file.path', data.id);
    phantom.exit();
};
page.open('http://some.pdf.url.com/some.pdf');
I call this "early but functional" because it works where I've tested it
(Linux, PDF downloads), but it likely has a small memory leak, and I'm not 100%
convinced the callback mechanism I used is ideal.
Comments desired.
Original comment by ja...@recovend.com
on 10 Aug 2012 at 6:21
I've downloaded and built the branch above, but I can't seem to get the
onUnsupportedContentReceived event to fire, and calling saveUnsupportedContent
throws an undefined error. Are there special build steps required to enable it?
Thanks,
Robert
Original comment by rotava...@gmail.com
on 1 Sep 2012 at 4:21
No special build steps required, as far as I know. If
saveUnsupportedContent is undefined, maybe you haven't built the version in
the add_download_capabilities branch (git checkout
add_download_capabilities after the git clone)? Just speculating.
Original comment by ja...@recovend.com
on 4 Sep 2012 at 2:44
I second the XHR+base64 method. It takes another 50+ lines of code to send to
page.evaluate(), and I have to de-base64 the content afterward. That's
basically how CasperJS does it, as far as I can tell from their code (they do
a lot of weird, and in my book unnecessary, binding with window.__utils__ in
the page context).
I used this one (first answer):
http://stackoverflow.com/questions/7370943/retrieving-binary-file-content-using-javascript-base64-encode-it-and-reverse-de
It works great. Just be sure to try-catch the call to base64ArrayBuffer(),
because new Uint8Array(arrayBuffer) may throw an error, and check that
xhr.getResponseHeader('content-type') == 'application/pdf' if you're doing PDF
downloads like I was.
Original comment by audi...@gmail.com
on 4 Sep 2012 at 3:24
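The XHR+base64 workaround described in the comments above can be sketched roughly as follows. This is only an illustration, not the code from the jsfiddle or the Stack Overflow answer: the helper names binaryStringToBase64 and fetchAsBase64 are made up for this example, and it assumes a synchronous same-origin GET with overrideMimeType so that the result can be returned directly from page.evaluate(). Both functions would have to be defined in the page context.

```javascript
var B64_CHARS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Hypothetical helper: base64-encode a "binary string" in which each
// character carries one byte in its low 8 bits.
function binaryStringToBase64(text) {
    var out = '';
    for (var i = 0; i < text.length; i += 3) {
        var has1 = i + 1 < text.length;
        var has2 = i + 2 < text.length;
        var b0 = text.charCodeAt(i) & 0xff;
        var b1 = has1 ? text.charCodeAt(i + 1) & 0xff : 0;
        var b2 = has2 ? text.charCodeAt(i + 2) & 0xff : 0;
        out += B64_CHARS.charAt(b0 >> 2);
        out += B64_CHARS.charAt(((b0 & 3) << 4) | (b1 >> 4));
        // Pad with '=' when the final group has fewer than three bytes.
        out += has1 ? B64_CHARS.charAt(((b1 & 15) << 2) | (b2 >> 6)) : '=';
        out += has2 ? B64_CHARS.charAt(b2 & 63) : '=';
    }
    return out;
}

// Hypothetical helper, meant to run in the page context.
// Returns the base64-encoded body, or null if anything goes wrong.
function fetchAsBase64(url, expectedType) {
    try {
        var xhr = new XMLHttpRequest();
        xhr.open('GET', url, false); // synchronous, so the value can be returned
        // Ask the browser to hand the bytes back untouched as text.
        xhr.overrideMimeType('text/plain; charset=x-user-defined');
        xhr.send(null);
        if (xhr.status !== 200) {
            return null;
        }
        var type = xhr.getResponseHeader('Content-Type') || '';
        if (expectedType && type.indexOf(expectedType) !== 0) {
            return null; // e.g. an HTML login page instead of the PDF
        }
        return binaryStringToBase64(xhr.responseText);
    } catch (e) {
        return null;
    }
}
```

The string returned from page.evaluate() would then have to be decoded back to bytes on the PhantomJS side before being written to disk. As noted elsewhere in this thread, this only helps when the URL is known and can be requested with a plain GET.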
I need this as well. Can't use the XHR method because the inline attachments I
need to scrape don't come with a URL I can hit.
Original comment by subel...@gmail.com
on 4 Oct 2012 at 11:03
Wouldn't inline attachments be even more easily downloaded? For an image:
var content = page.evaluate(function() {
    return $('img#whatever').attr('src');
});
fs.write(yer_path, content, 'w');
---
Ariya, can you give some estimate of how long this feature (downloading a url)
would take to implement? I'd love to get involved in PhantomJS development, but
maybe this issue is a lot trickier than it sounds?
Original comment by audi...@gmail.com
on 4 Oct 2012 at 1:59
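For an image that really is embedded inline, the src attribute is a data: URI, and the payload has to be separated from its prefix before anything useful can be written out. A small sketch of that step follows; parseDataUri is a hypothetical helper, and it only covers the common data:<mime>;base64,<payload> shape.

```javascript
// Hypothetical helper: split a data: URI into its parts.
// Returns null for anything that is not a data: URI (e.g. a normal http URL).
function parseDataUri(src) {
    var match = /^data:([^;,]*)((?:;[^;,]*)*),(.*)$/.exec(src);
    if (!match) {
        return null;
    }
    var params = match[2].split(';');
    return {
        mimeType: match[1] || 'text/plain',
        isBase64: params.indexOf('base64') !== -1,
        payload: match[3]
    };
}
```

If parseDataUri returns null, the src is an ordinary URL, and writing it to disk would save only the URL string rather than the image itself.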
Sorry, I didn't mean to write "inline". The file I need is not an image and is
not part of the DOM. It gets sent as a result of a POST with the
Content-Disposition header 'attachment;filename="report.csv"'
Original comment by subel...@gmail.com
on 5 Oct 2012 at 10:51
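A script can at least detect this case by inspecting the response headers. The sketch below assumes the headers arrive as an array of {name, value} objects, as in the response object passed to page.onResourceReceived. attachmentFilename is a hypothetical helper, and it handles only the simple filename="..." form, not the extended filename*= encoding. Detecting the attachment this way does not give access to its body.

```javascript
// Hypothetical helper: returns the attachment's file name, '' if the response
// is an attachment without a usable name, or null if it is not an attachment.
function attachmentFilename(headers) {
    for (var i = 0; i < headers.length; i++) {
        if (headers[i].name.toLowerCase() !== 'content-disposition') {
            continue;
        }
        var value = headers[i].value;
        if (!/^\s*attachment\b/i.test(value)) {
            return null; // e.g. "inline"
        }
        // Accept both quoted and unquoted file names.
        var match = /filename\s*=\s*(?:"([^"]*)"|([^;\s]+))/i.exec(value);
        if (!match) {
            return '';
        }
        return match[1] !== undefined ? match[1] : match[2];
    }
    return null; // no Content-Disposition header at all
}
```

With the header quoted in the comment above, attachment;filename="report.csv", the helper returns report.csv.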
Hi there. I think the base64-encoding solution can only be a stop-gap solution.
- Downloading big files will probably exhaust memory, and base64 encoding and
decoding them will use up resources that would be better spent elsewhere.
Therefore we want the option to redirect a downloaded stream to a file.
- We may have pages where we cannot control the loading of a file that is not
supported (e.g. PDF).
- We may want to save resources that have already been loaded as part of the
page (e.g. images).
I think the optimal solution would be to add functionality to the
onResourceReceived hook to allow setting up a "redirection" handler, and if
such a handler is set, unsupported file formats should silently be downloaded.
This handler could then have another onDownloadFinished hook to resume
operation once the download is done.
Original comment by bogusan...@gmail.com
on 20 Nov 2012 at 4:48
Original comment by james.m....@gmail.com
on 12 Jan 2013 at 4:33
Closing. This issue has been moved to GitHub:
https://github.com/ariya/phantomjs/issues/10052
Original comment by james.m....@gmail.com
on 16 Mar 2013 at 12:17
Original issue reported on code.google.com by
alexsa...@gmail.com
on 28 Feb 2011 at 10:15