VBA-tools / VBA-Web

VBA-Web: Connect VBA, Excel, Access, and Office for Windows and Mac to web services and the web
http://vba-tools.github.io/VBA-Web/
MIT License

How to handle binary download from an http get #398

Open joekane101 opened 5 years ago

joekane101 commented 5 years ago

Aside from the ongoing topic of binary upload and download, I am trying to download some files from an HTTP GET and save them as binary files. Setting the WebFormat to PlainText and then saving Response.Body to a file via ADODB.Stream works most of the time. JPEG files have mostly worked well, but for some other types I am seeing this error from WebResponse.CreateFromHttp:

ERROR - WebResponse.CreateFromHttp: -2147210474 (11030 / 80042b16), An error occurred while creating response from http -2147023783 (80070459): No mapping for the Unicode character exists in the target multi-byte code page.

This is the same problem Tim mentioned in this comment on #110: https://github.com/VBA-tools/VBA-Web/issues/110#issuecomment-102888615

The raw binary from some files chokes the line: Me.Content = Http.ResponseText

but CreateFromHTTP doesn't have context to know to skip the content assignment line.
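For reference, the save step described above (writing the response body to disk through ADODB.Stream) might look roughly like the sketch below. SaveBytesToFile is a hypothetical helper name, not part of VBA-Web, and the stream is late-bound so no project reference is assumed.

```vba
' Hypothetical helper (not part of VBA-Web): writes raw bytes to a file
' using a late-bound ADODB.Stream in binary mode.
' Bytes is expected to hold a Byte array, e.g. the Variant in Response.Body.
Public Sub SaveBytesToFile(ByVal FilePath As String, ByVal Bytes As Variant)
    ' ADODB constants declared locally because the object is late-bound
    Const adTypeBinary As Long = 1
    Const adSaveCreateOverWrite As Long = 2

    Dim Stream As Object
    Set Stream = CreateObject("ADODB.Stream")

    Stream.Type = adTypeBinary      ' must be set before the stream is opened
    Stream.Open
    Stream.Write Bytes              ' writes the byte array unchanged
    Stream.SaveToFile FilePath, adSaveCreateOverWrite
    Stream.Close
End Sub
```

Because the stream is in binary mode, nothing is decoded as text along the way; the failure reported above comes from reading Http.ResponseText, not from this save step.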

I was trying to think of the easiest way to handle this. Looking at the code, I suppose the recommendation might be to register custom converters, but I don't necessarily know all of the MIME types in advance, and some get complicated (Office files, for example).

I know that I'm asking for a binary payload. I was initially thinking I could specify a WebFormat of Raw or Binary, which would provide the context needed to skip parsing the content, but that is not one of the options today.

Any suggestions?

zgrose commented 5 years ago

I don't have an answer for you that uses the WebRequest/WebResponse objects, but I can tell you how I download files with the MSXML2.XMLHTTP60 object and you can try to adapt for this project.

1- send your GET/POST/whatever
2- get the filename from the Content-Disposition header (if needed)
3- use this "hack" to get a byte array from the response

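Dim xhr As MSXML2.XMLHTTP60
Dim fileBytes() As Byte
Dim filePath As String
Dim fileNum As Integer
' ...snip opening and sending request...
' responseBody holds the raw bytes; assigning it to a Byte array skips any text decoding
fileBytes = xhr.responseBody
' ask VBA for an unused file number instead of hard-coding #1
fileNum = FreeFile
Open filePath For Binary Access Write As #fileNum
Put #fileNum, , fileBytes
Close #fileNum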

At least that is what works for me when my ASP.NET and ASP.NET Core sites send files down.
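Step 2 above (taking the filename from the Content-Disposition header) could be sketched as follows. FileNameFromDisposition is a hypothetical helper, and the parsing is deliberately simplified: it only looks at the plain filename= parameter, ignores the encoded filename*= form, and would truncate a quoted name that itself contains a semicolon.

```vba
' Hypothetical helper: extracts the filename from a Content-Disposition value
' such as:  attachment; filename="report.pdf"
' Returns an empty string when no plain filename= parameter is present.
Public Function FileNameFromDisposition(ByVal HeaderValue As String) As String
    Const Key As String = "filename="
    Dim Pos As Long
    Dim Rest As String

    ' Case-insensitive search for the parameter name
    Pos = InStr(1, HeaderValue, Key, vbTextCompare)
    If Pos = 0 Then Exit Function

    ' Keep everything after "filename=" and cut off any later parameters
    Rest = Mid$(HeaderValue, Pos + Len(Key))
    Pos = InStr(1, Rest, ";")
    If Pos > 0 Then Rest = Left$(Rest, Pos - 1)
    Rest = Trim$(Rest)

    ' Strip surrounding double quotes, if any
    If Len(Rest) >= 2 Then
        If Left$(Rest, 1) = """" And Right$(Rest, 1) = """" Then
            Rest = Mid$(Rest, 2, Len(Rest) - 2)
        End If
    End If

    FileNameFromDisposition = Rest
End Function
```

With the xhr object from the snippet above, it would be called as FileNameFromDisposition(xhr.getResponseHeader("Content-Disposition")).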

joekane101 commented 5 years ago

Thanks for that. I think my implementation is essentially the same, and if (for this call) I comment out the line Me.Content = Http.ResponseText, then the binary data is saved to file just fine. So it's just the content parse I need to avoid (but only for this call).

Since I'm making a ton of other calls through WebRequest/WebResponse in this section, I was trying to find a cleaner way that doesn't involve hacking CreateFromHttp. But I guess the first hack is already setting the WebFormat to PlainText. Maybe that makes the case for a WebFormat of Binary/Raw, so that CreateFromHttp would have the context to not try to parse the content.

But the 'clean' library changes for the above are probably a bit too sophisticated for me.

joekane101 commented 5 years ago

For the interim, I added a WebFormat enum value of 'Raw'. I use that type for the response format, and then added some checks for Raw type to skip trying to parse content. Then I just save off the body (binary bytes) as a file. Works.

timhall commented 5 years ago

Seems like a good approach, mind sending a PR if you can?

joekane101 commented 5 years ago

Afraid that I am somewhat Git-challenged at the moment, but:

In file WebHelpers.bas:

''
' @property WebFormat
' @param PlainText
' @param Json
' @param FormUrlEncoded
' @param Xml
' @param Custom
'JK patch:05/08/19
' @param Raw
'end patch
' @default PlainText
''
Public Enum WebFormat
    PlainText = 0
    Json = 1
    FormUrlEncoded = 2
    Xml = 3
    'JK patch:05/08/19
    Raw = 4
    'end patch
    Custom = 9
End Enum

In file WebResponse.cls, in subroutine CreateFromHttp:

'JK patch 05/08/19
'Me.Content = Http.ResponseText
If Request.ResponseFormat <> WebFormat.Raw Then Content = Http.ResponseText
'end patch

In subroutine web_LoadValues:

'JK patch 05/08/19
' If web_Request.ResponseFormat <> WebFormat.PlainText Then
If web_Request.ResponseFormat <> WebFormat.PlainText And web_Request.ResponseFormat <> WebFormat.Raw Then
'end JK patch
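With a patch along these lines applied, a download call might look roughly like the sketch below. WebFormat.Raw exists only in the patched copy of the library, and the base URL, resource, and output path are placeholders. The file is written with the same Open/Put approach shown earlier in the thread.

```vba
' Sketch only: assumes the patched VBA-Web described above (WebFormat.Raw is not
' in the stock library). URL, resource and target path are placeholders.
Public Sub DownloadBinaryExample()
    Dim Client As New WebClient
    Dim Request As New WebRequest
    Dim Response As WebResponse
    Dim Bytes() As Byte
    Dim FileNum As Integer

    Client.BaseUrl = "https://example.com/api/"
    Request.Resource = "files/report.pdf"
    Request.Method = WebMethod.HttpGet
    Request.ResponseFormat = WebFormat.Raw   ' patched value: skip reading ResponseText and parsing

    Set Response = Client.Execute(Request)
    If Response.StatusCode <> WebStatusCode.Ok Then
        Debug.Print "Download failed: " & Response.StatusCode & " " & Response.StatusDescription
        Exit Sub
    End If

    ' Body is a Variant holding the raw bytes; copy it into a Byte array first,
    ' because Put on a Variant would also write a Variant type descriptor to the file
    Bytes = Response.Body

    FileNum = FreeFile
    Open "C:\Temp\report.pdf" For Binary Access Write As #FileNum
    Put #FileNum, , Bytes
    Close #FileNum
End Sub
```

Since Content is never assigned for Raw requests, Response.Content stays empty here and only Response.Body should be relied on.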