Open Cmeesh11 opened 2 weeks ago
Thanks! I think I see the issue in the DOM type serializer layer. btoa apparently only works with ASCII-range characters.
Exactly, and I believe PDF files are all going to contain non-ASCII bytes, so that makes sense!
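For reference, btoa (a browser global, and also a global in Node 16+) accepts any character up to code point 0xFF and throws on anything above that, which is why raw PDF bytes pushed into a string can trip it. A minimal illustration:

```javascript
// btoa requires a "binary string": every char code must be <= 0xFF.
// Characters above that range throw an InvalidCharacterError.
const ok = btoa(String.fromCharCode(0xff)); // highest accepted code point
let threw = false;
try {
  btoa('\u0100'); // code point 256: out of range for btoa
} catch (e) {
  threw = true;
}
console.log(ok, threw); // "/w==" true
```

So the serializer has to hand btoa a string whose char codes all fit in one byte.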
Well, then I went into the code, and we're doing this already:
const binary = Array.from(new Uint8Array(value.buffer, value.byteOffset, value.byteLength))
  .map(byte => String.fromCharCode(byte))
  .join('');
I think we had to move away from String.fromCharCode(...bytes) because the varargs call breaks at some size of binary. Need to figure out what to replace this with.
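For what it's worth, that limit is easy to reproduce. The exact threshold is engine-dependent, but a buffer of a few million bytes reliably exceeds it:

```javascript
// Spreading a typed array passes one argument per byte, and JS engines
// cap how many arguments a single call can take, so large buffers throw
// (typically a RangeError; the threshold varies by engine).
const huge = new Uint8Array(10_000_000);
let failed = false;
try {
  String.fromCharCode(...huge);
} catch (e) {
  failed = true;
}
console.log(failed); // true on typical engines
```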
If the varargs limit is the issue, I believe TextDecoder works pretty well and is designed to handle larger arrays. You could do something like:
const binary = new Uint8Array(value.buffer, value.byteOffset, value.byteLength);
const decodedString = new TextDecoder('utf-8').decode(binary);
But just a suggestion.
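As a quick sanity check (assuming Node 11+ or any modern browser, where TextDecoder is a global), the decoder takes the whole typed array in one call, so there is no argument-count limit to hit:

```javascript
// TextDecoder consumes the typed array directly rather than spreading it
// into call arguments, so buffer size isn't constrained by argument limits.
const big = new Uint8Array(5_000_000).fill(0x41); // ~5 MB of 'A' bytes
const text = new TextDecoder('utf-8').decode(big);
console.log(text.length === big.length); // true: ASCII decodes 1 byte -> 1 char
```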
Good suggestion. I was exploring that as well. I think (you could try modifying your node_modules to check) that your binary is encoded in Latin-1. If that's the case, a UTF-8 decode will probably mangle the bytes, so you'd end up needing something awkward like:
let decodedString;
try {
  // Attempt a strict UTF-8 decode; fatal: true makes invalid byte
  // sequences throw instead of silently becoming U+FFFD
  decodedString = new TextDecoder('utf-8', { fatal: true }).decode(binary);
} catch (e) {
  // Fall back to Latin-1 if the bytes aren't valid UTF-8
  decodedString = new TextDecoder('latin1').decode(binary);
}
I think we could also avoid the variadic call by doing:
const dataArray = Array.from(new Uint8Array(value.buffer, value.byteOffset, value.byteLength));
const binary = String.fromCharCode.apply(null, dataArray);
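One caveat worth flagging: .apply passes the array elements as call arguments too, so it hits the same engine limit as the spread form. A chunked loop (the chunk size here is an arbitrary value safely under typical limits) avoids the limit while staying byte-for-byte faithful, which a UTF-8 decode can't guarantee for arbitrary binary:

```javascript
// Sketch: convert bytes to a binary string in chunks small enough that
// String.fromCharCode never receives too many arguments in one call.
function bytesToBinaryString(bytes, chunkSize = 0x8000) {
  let binary = '';
  for (let i = 0; i < bytes.length; i += chunkSize) {
    binary += String.fromCharCode(...bytes.subarray(i, i + chunkSize));
  }
  return binary;
}

// Every byte value 0-255 round-trips to the same char code.
const bytes = new Uint8Array(100_000).map((_, i) => i % 256);
const binary = bytesToBinaryString(bytes);
console.log(binary.length === bytes.length, binary.charCodeAt(255) === 255);
```

The resulting string is safe to pass to btoa, since every char code stays within 0x00-0xFF.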
This would be in TypeSerializer in @ulixee/commons in node_modules.
Realized I hadn't checked in a fix for this to the commons project. It's in there now if you want to try it out, or you can wait for the next release.
Using hero.fetch, I'm able to properly retrieve a PDF file from a site. I want to convert this to a buffer so I can save it, but I get this error every time:
Here is the request I'm making: