I had some issues with Browser.retrieve and original filenames, at least in Python 2.6:
Browser.retrieve(someurl) returns a (tmp_filename, httplib.HTTPMessage), with a temporary filename from tempfile.mkstemp;
Browser.retrieve(someurl, filename) returns a (filename, httplib.HTTPMessage);
but there's no way to get the original filename, even if it's present in the 'Content-disposition: attachment; filename="abcd.xyz"' httplib.HTTPMessage header.
That's not really mechanize's fault: to extract those header parameters, httplib.HTTPMessage is missing a crucial 'get_filename' method, or a more generic 'get_param' method, both of which are present in the email.message.Message class. httplib.HTTPMessage does have a 'getparam' method but, unfortunately, it's only usable for parsing the 'content-type' header.
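To illustrate what those two email.message.Message methods provide, here is a minimal sketch that feeds a raw header value into a throwaway Message object. The helper names filename_from_content_disposition and param_from_header are made up for this example; they are not part of mechanize or httplib.

```python
from email.message import Message


def filename_from_content_disposition(header_value):
    """Return the 'filename' parameter of a Content-Disposition value, or None.

    Hypothetical helper, for illustration only.
    """
    msg = Message()
    msg["Content-Disposition"] = header_value
    # get_filename() looks up the 'filename' parameter of Content-Disposition
    return msg.get_filename()


def param_from_header(header_name, header_value, param):
    """Return one parameter of an arbitrary header, or None if absent.

    Hypothetical helper, for illustration only.
    """
    msg = Message()
    msg[header_name] = header_value
    # Unlike the content-type-only 'getparam', get_param() takes a header name
    return msg.get_param(param, header=header_name)


if __name__ == "__main__":
    value = 'attachment; filename="abcd.xyz"'
    print(filename_from_content_disposition(value))
    print(param_from_header("Content-Disposition", value, "filename"))
```

With the tuple returned by Browser.retrieve, the header value could presumably be fetched with headers.getheader('Content-Disposition') and passed to the first helper, assuming the header is present at all.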
I submitted an issue on the Python tracker (http://bugs.python.org/issue11316) and proposed a 'monkeypatch_http_message' decorator as a workaround, so we can do:
import mechanize
from some.module import monkeypatch_http_message
browser = mechanize.Browser()
(tmp_filename, headers) = browser.retrieve(someurl)
# monkeypatch the httplib.HTTPMessage instance
monkeypatch_http_message(headers)
# yeah... my original filename, finally
filename = headers.get_filename()
Once again, that's the situation in Python 2.6. According to http://bugs.python.org/issue4773, httplib.HTTPMessage in Python 3.x is using email.message.Message underneath.
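A quick way to check the Python 3 side is sketched below. It builds an http.client.HTTPMessage by hand instead of performing a real request, and original_filename is a hypothetical helper name chosen for this example. This only runs on Python 3, where httplib was renamed to http.client.

```python
import http.client
from email.message import Message


def original_filename(headers, default=None):
    """Return the Content-Disposition filename from a Python 3 HTTPMessage.

    Hypothetical helper, for illustration only.
    """
    return headers.get_filename(default)


# Build the headers by hand; a real response object would supply them instead.
headers = http.client.HTTPMessage()
headers["Content-Disposition"] = 'attachment; filename="abcd.xyz"'

if __name__ == "__main__":
    # HTTPMessage inherits get_filename() from email.message.Message
    print(issubclass(http.client.HTTPMessage, Message))
    print(original_filename(headers))
```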
(ps: this is an edited repost of issue 35, that I closed by mistake...)