sindresorhus / file-type

Detect the file type of a file, stream, or data
MIT License
3.64k stars 345 forks source link

Add support for reading from a WebStreams #635

Closed Borewit closed 2 months ago

Borewit commented 2 months ago

Changes:

Moves remaining native Node Readable streams to the Node.js specific entry point (index.js). As the fileTypeStream is based on the the native Node Readable stream, that function is no longer available in the primary (other then Node engine) entry point (core.js).

:warning: fromBlob() requires Node.js ≥ 20 as the strtok3.fromStream requires a Readable from which a ReadableStreamBYOBReader can be created. Which is only available since Node.js 20.

Updating dependencies to:

graph TD;
    FTN("file-type (Node.js entry point)")-->FTP
    FTP("file-type (primary entry point)")-->S
    S(strtok3)-->P(peek-readable)
    S(strtok3)-->TO("@tokenizer/token")
    TY(token-types)-->TO
    TY-->IE("ieee754")
    FTP-->TY
    NS("node:stream")
    FTN-->NS
    FTP-->UAE(uint8array-extras)
    style NS fill:#F88,stroke:#A44
    style IE fill:#CCC,stroke:#888
    style FTN fill:#FAA,stroke:#A44

Previous dependency diagram can be found here: https://github.com/sindresorhus/file-type/issues/578#issuecomment-1789138382

sindresorhus commented 2 months ago

I would prefer to have a single fromStream method that supports both.

Borewit commented 2 months ago

I would prefer to have a single fromStream method that supports both.

That was quick. I will give that some thought, I like to release this one together #633, as joined effort to work without Node Streams and Buffer.

Borewit commented 2 months ago

I would prefer to have a single fromStream method that supports both.

@sindresorhus, are you not concerned that this make it more difficult to for tree-shaking to exclude node:buffer and nodes:stream when these are effectively not being used?

sindresorhus commented 2 months ago

I was thinking that fromStream in core.js would only support web streams, same with the one in browser.js, but in index.js it would get the additional support for Node.js streams. That way, only Node.js users gets the Node.js imports.

sindresorhus commented 2 months ago

Alternatively, we could just drop support for Node.js streams, and just tell users to call .toWeb() on their Node.js streams. That may be even better actually.

sindresorhus commented 2 months ago

Alternatively, we could just drop support for Node.js streams, and just tell users to call .toWeb() on their Node.js streams. That may be even better actually.

Let's go with this.

Borewit commented 2 months ago

Alternatively, we could just drop support for Node.js streams, and just tell users to call .toWeb() on their Node.js streams. That may be even better actually.

Let's go with this.

The [toWeb() method] (https://nodejs.org/api/stream.html#streamreadabletowebstreamreadable) is still experimental.

Maybe first release with as a hybrid solution, including Node.js & Web Streams. And indeed, moving the Node.js dependencies to the Node.js specific entry point index.js . Also note that in addition to the fileTypeFromStream(), there is the fileStream() which is still depends on Node.js streams, and also shared to with the primary entry point. I am honestly not very motivated to update that method, as it not complaint with the current architecture where file-type has access to the entire file. The argument for backward compatibility is now fading, as Web Streams are now becoming widely available.

sindresorhus commented 2 months ago

Maybe first release with as a hybrid solution, including Node.js & Web Streams. And indeed, moving the Node.js dependencies to the Node.js specific entry point index.js .

Ok, sure. But at least add a TODO comment to remove Node.js streams support and recommend toWeb() instead when it's stable.

sindresorhus commented 2 months ago

there is the fileStream() which is still depends on Node.js streams

It could be moved to index.js as it's already documented to only be available on Node.js.

Borewit commented 2 months ago

there is the fileStream() which is still depends on Node.js streams

It could be moved to index.js as it's already documented to only be available on Node.js.

That is very doable. We can raise an issue, to implement the same concept using Web Streams. I actually starting working on that, but it takes some time. Tricky stuff those streams.

Update, raised issue: #636 to implementing detecting web stream

sindresorhus commented 2 months ago

@Borewit Are we ready to do another release now or do you plan more changes? None of these changes are technically breaking, right? Since Buffer is also a Uint8Array, so https://github.com/sindresorhus/file-type/commit/00e051bceaf0791ffc1b08b36aee196ccdd95606 should not be breaking (it's only breaking if we would have changed a return value from Buffer to Uint8Array).

Borewit commented 2 months ago

@Borewit Are we ready to do another release now or do you plan more changes? None of these changes are technically breaking, right? Since Buffer is also a Uint8Array, so 00e051b should not be breaking (it's only breaking if we would have changed a return value from Buffer to Uint8Array).

This morning I realized, what could be documented better is the NodeFileTypeParser class

https://github.com/sindresorhus/file-type/blob/b815b5e173dd9f521311615eb33e955181030ca9/index.js#L9

which should be used instead of the FileTypeParser, for those implementing custom extensions and usign Node.js.

No objection to release now.

Borewit commented 1 month ago

Part of v19.1.0