terkelg / tiny-glob

Super tiny and ~350% faster alternative to node-glob
MIT License
855 stars 25 forks

Iterator Support for large globs #30

Open rijnhard opened 5 years ago

rijnhard commented 5 years ago

Hi

Consider a method where the return object could be an Iterator or AsyncIterator so that large file globs (as in huge numbers of files) are supported.

terkelg commented 5 years ago

Is this to avoid blocking the main thread, and what would such an implementation look like?

rijnhard commented 5 years ago

There are a few things involved here, @terkelg.

There are better implementations than what I did; this just happened to be fine for my use case. In my local implementation, I used fast-glob and adapted its stream into an async iterator using the stream-to-async-iterator library.

Concerns:

Notes:

Usage:

import fglob from 'fast-glob';
import StreamIteratorAdapter from 'stream-to-async-iterator';

async function process(dirglob, globOptions) {
    const stream = fglob.stream(dirglob, globOptions),
        iterator = new StreamIteratorAdapter(stream);

    for await (const stat of iterator) {
        // processes items individually allowing us to handle massive glob lists without hitting resource limits
    }
}

terkelg commented 5 years ago

Thanks for elaborating. This seems a bit complex. Is it possible to add this as an extension/wrapper around tiny-glob?

rijnhard commented 5 years ago

If you can provide a stream option it will allow this use case, and with time it will get more elegant. We can't do any higher level iteration if we don't have some async way of processing entries with backpressure.

In the implementations I've seen, this usually comes back to readdir. Some packages provide this (like fast-glob), and digging through their code it looks like they use readdir-enhanced, which somehow (I didn't look into that code) manages to provide a stream.
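
For illustration, a readdir-style traversal can also be written as an async generator with no stream handling at all. This is only a sketch under a few assumptions: it needs a Node version that has `fs.promises.opendir` (12.12 or later), `walk` is a hypothetical helper name rather than anything tiny-glob, fast-glob, or readdir-enhanced exposes, and it does no pattern matching, it just yields every file path under a directory.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not part of any library mentioned in this thread).
// Yields the path of every non-directory entry under `dir`, one at a time.
async function* walk(dir) {
  // opendir returns a directory handle that reads entries in small batches,
  // so the full listing is never held in memory at once.
  const handle = await fs.promises.opendir(dir);
  for await (const dirent of handle) {
    const full = path.join(dir, dirent.name);
    if (dirent.isDirectory()) {
      // Recurse lazily: nothing in the subdirectory is read until the
      // consumer asks for the next value.
      yield* walk(full);
    } else {
      yield full;
    }
  }
}
```

Because the generator only advances when the consumer requests the next value, a slow consumer naturally slows down the directory reads. A glob library would still need to apply its pattern matching on top of something like this.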

But to make it easier for you, I'd wrap the stream and just expose an async iterator via generator functions; otherwise you have to do a bunch of stream handling, and that's error-prone and painful.
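
A rough sketch of that wrapping, assuming a recent Node version in which readable streams can be consumed directly with `for await`. The name `entriesFrom` is hypothetical, and `Readable.from([...])` stands in for a real glob stream (for example the one returned by `fglob.stream(...)` in the usage above), so the snippet runs without any glob library installed.

```javascript
const { Readable } = require('stream');

// Hypothetical wrapper: exposes any readable stream of entries as an
// async iterator, so callers never touch stream events themselves.
async function* entriesFrom(stream) {
  // Iterating a Readable with for-await pulls chunks as they are consumed,
  // so the stream's internal buffering provides the backpressure.
  for await (const entry of stream) {
    yield entry;
  }
}

// Stand-in for a glob stream; a real one would emit matched paths.
async function main() {
  const fakeGlobStream = Readable.from(['src/a.js', 'src/b.js']);
  for await (const entry of entriesFrom(fakeGlobStream)) {
    console.log(entry);
  }
}

main().catch(console.error);
```

The generator adds little beyond what the stream already offers, but it gives a library a place to hide the stream entirely and return a plain async iterator from its public API.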

