Open jimmywarting opened 3 years ago
HI,bro, I think you mentioned two points:
ESM
, it seems not too difficult. I will submit a PR this week if I have time;TextDecoder
, I don't think its compatibility is very good, it need the full ICU data.
Maybe we can introduce TextDecoder on the existing code? @ashtuchkin just though of something else. What if instead of using node:streams
you used (async) iterators instead
node:streams
node:stream
in nodejs then you could just use stream.Readable.from(iterable).pipeTo(...)
// function * decodeIterableSync(iterable) { yield transformation(...) }
// async function * decodeIterable(iterable) { yield transformation(...) }
// returns a new iterable (generator) const iterableTransformer = iconv.decodeIterableSync(iterable) for (let str of iterableTransformer) result += str
str = [...iconv.decodeIterableSync(iterable)].join('')
// returns a new async iterable (async generator) const iterableTransformer = iconv.decodeIterable(fs.createReadStream('./file.txt')) for await (let str of iterableTransformer) result += str
stream.Readable.from(iterableTransformer).pipeTo(dest)
think this would be best for cross compatibility
hi,i upgraded it ro es6 @ashtuchkin
Hey first of all thanks for the feature requests. Let me address them one by one.
ES Modules - correct me if I'm wrong, but this is not required for cross-compatibility between Node, Browser and Deno, is it? You could use regular CommonJS in browser for quite some time using Browserify or similar libraries. Deno also seems to support it. Maybe it's not as ergonomic as ES Modules, sure, but it seems more like a tradeoff we need to consider, with supporting older Node versions on the other end (specifically versions less than v15.3.0, v12.22.0, which are pretty recent afaik). I'll definitely keep that in mind, but I'm not sure I'm ready to do this leap at this point.
Buffer -> UInt8Array. This one I agree. It was also requested by vscode team. I'm working on it in next
branch, but it's a bit slow. It'll get there at some point.
TextDecoder. This one is a bit tricky. We can probably check if it exists in the environment and use it for performance, but that would require some refactoring to avoid cluttering the codebase. I'll keep that in mind as an perf optimization. We definitely can't fully switch to it because 1) it doesn't support encoding (and TextEncoder is kinda useless) and 2) the set of encodings it supports is much smaller than iconv-lite.
Node streams -> Async iterators. This is interesting, I haven't seen it done that way. It will be easier to do with the concept of backends that I introduced in the next
branch, so that Browser and Deno don't require streams at all. I'll keep that in mind too.
You could use regular CommonJS in browser for quite some time using Browserify or similar libraries ... Deno also seems to support it. Maybe it's not as ergonomic as ES Modules
yea, but that requires compilers and you can't use a CDN, it needs lots of extra works for tiny projects, ESM is the feature whether or not you like it... it is the feature of how to load everything in Deno, browser and with node's now new HTTP loader
ES Modules can still be loaded from cjs if you need it to be. But the tradeof is that it needs to be loaded using async with import('iconv-lite').then(...)
this can work from commonjs...
I read some documents, I think I understand the reason for using Buffer in iconv-lite,that Buffer instances can be converted to and from normal JavaScript strings. But Buffer seems to work only in NodeJS,and the Uint8Array's Browser compatibility is much better. Although they can be interchangeable, like this:
// From Buffer to ArrayBuffer:
var buf = Buffer.from("hello world!")
// var toUint = buf.buffer
// Slice (copy) its segment of the underlying ArrayBuffer
let ab = buf.buffer.slice(buf.byteOffset, buf.byteOffset + buf.byteLength);
// From ArrayBuffer to Buffer:
var buffer = Buffer.from( new Uint8Array(ab) );
But the UInt8Array is enough. I think the UInt8Array should be used like this: get the unicode code for each word, and then fill the Uin8Array in according to the UTF-8 encoding.... uh...🤣I created a demo,I'm not sure we can do that, Maybe you can take a look at it.
Node's Buffer is currently a subclass of UInt8Array, so we can use Buffer in all places that accept UInt8Array (liskov substitution principle). Even more, iconv-lite also does the reverse - it accepts UInt8Array in the same places where it accepts Buffer, e.g. in the decode() function. The problem is .encode() function. It currently returns Buffer, which has a richer API than just UInt8Array and any code that depends on this will break if we start returning UInt8Array instead. That is a breaking change of our core API, which would require a major version bump. That will also force us to stop supporting older Node versions where Buffer was not a subclass of UInt8Array (before v3.0 afaik).
On Fri, Sep 10, 2021 at 11:09 PM yosion-p @.***> wrote:
I read some documents, I think I understand the reason for using Buffer in iconv-lite,that Buffer instances can be converted to and from normal JavaScript strings. But Buffer seems to work only in NodeJS,and the Uint8Array's Browser compatibility is much better. Although they can be interchangeable, like this:
// From Buffer to ArrayBuffer: var buf = Buffer.from("hello world!") // var toUint = buf.buffer // Slice (copy) its segment of the underlying ArrayBuffer let ab = buf.buffer.slice(buf.byteOffset, buf.byteOffset + buf.byteLength);
// From ArrayBuffer to Buffer: var buffer = Buffer.from( new Uint8Array(ab) );
But the UInt8Array is enough. I think the UInt8Array should be used like this: get the unicode code for each word, and then fill the Uin8Array in according to the UTF-8 encoding.... uh...🤣I created a demo https://github.com/yosion-p/unint8array-demo,I'm not sure we can do that, Maybe you can take a look at it.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ashtuchkin/iconv-lite/issues/273#issuecomment-917327620, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAEZKHKL6UOCKKFNUCB3DRDUBLB7TANCNFSM5B3TCWNA .
before v3.0 afaik
Yes, According to official docs:
| v3.0 | [ The Buffers class now inherits from Uint8Array.](https://nodejs.org/docs/latest-v15.x/api/buffer.html#buffer_buffers_and_typedarrays) -- | --So, I don't think we need to be compatible with anything prior to Node3.0 in the future. You know Encode () and decode() looks not much code, but the logic referenced inside is a bit complicated.
We are currently compatible with v0.10+, but if we change to using UInt8Array then we'll have to up it.
On Sat, Sep 11, 2021, 05:44 yosion-p @.***> wrote:
before v3.0 afaik
Yes, According to official docs: v3.0 The Buffers class now inherits from Uint8Array. https://nodejs.org/docs/latest-v15.x/api/buffer.html#buffer_buffers_and_typedarrays
So, are we no longer compatible with node3.0 pre-release?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ashtuchkin/iconv-lite/issues/273#issuecomment-917377132, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAEZKHOBBANY6EHTZF3RQ2LUBMQGRANCNFSM5B3TCWNA .
ppl are starting to use import instead it's more cross compatible for both browsers, Deno and Node's new experimental http import so you won't depend on npm
lots of folks want stuff to be more cross compatible between different environment. That means using import and ditching Buffer for simple Uint8Array instead.
Also how about switching to using TextDecoder instead? - it's available in Deno, browser and node globally