thecodrr / fdir

⚡ The fastest directory crawler & globbing library for NodeJS. Crawls 1m files in < 1s
https://thecodrr.github.io/fdir/
MIT License
1.51k stars 58 forks source link

Feature request: Multiple filters joined via AND #35

Closed kwaping closed 3 years ago

kwaping commented 4 years ago

I would like to use a chain of multiple .filter() operations, but have them successively refine the results. For example, I would like to find files where:

  1. extension is .html
  2. the path contains /src/
  3. the path contains /trunk/
  4. the filename does not contain test

I would like to implement that via this code:

const crawler = new fdir()
  .withBasePath()
  .filter(path => path.match(/\.html/))
  .filter(path => path.contains('/src/'))
  .filter(path => path.contains('/trunk/'))
  .filter(path => !path.contains('test'));

The current implementation of multi-filters treats this like an OR, which gives me many undesired results.

Thank you for considering my request!

thecodrr commented 4 years ago

Okay, I knew this would occur sooner or later.

We have two options here:

  1. We change the API to treat multi-filters as AND conditional (not as OR).
  2. We add a second parameter to the .filter method that defines whether it is AND or OR.

Which one makes the most sense? Should filters be, by default, AND conditioned?

kwaping commented 4 years ago

I am not excited about breaking the existing API, but I think it's the right thing to do. The reason is that chained .filter calls in vanilla Javascript act in a cascading AND manner. You take the list, filter it, return the new list - then take that new list, filter it, return it - etc.

Besides this being what I would consider the expected behavior, there's also a lot less that can go wrong with this method. With the OR default, you can accidentally select more than you wanted - potentially a LOT more, causing speed and resource issues with your program. With the AND method, you might accidentally select too little, but that would have much less of a negative impact on your program.

I did work around the issue by just grabbing all the files and applying the cascading filters to the resulting array. The workaround for the current method would be to use pipes in a regex match. In terms of workarounds, leaving it as-is is superior. In that case, if you switch to AND by default, I suggest you add a flag for enabling OR mode.

EDIT: I forgot to add - thank you for your quick response and for this excellent piece of software!

thecodrr commented 4 years ago

I am not excited about breaking the existing API, but I think it's the right thing to do. The reason is that chained .filter calls in vanilla Javascript act in a cascading AND manner. You take the list, filter it, return the new list - then take that new list, filter it, return it - etc.

Yes but the filter of fdir works a little bit differently; it collects all the filters and then applies them all via filters.some. Since, there's no cascading, does it still make sense?

This line handles most of it: if (filters.some((f) => f(filename))) files.push(filename);

To change it, all that has to be done is to swap .some with .every, I suppose.

In effect, this is just a convenient wrapper. You could easily just use a single .filter and put all the conditions in one (it might even be slightly more performant).

In terms of workarounds, leaving it as-is is superior. In that case, if you switch to AND by default, I suggest you add a flag for enabling OR mode.

I suppose we can add an option to toggle OR/AND mode.

I forgot to add - thank you for your quick response and for this excellent piece of software!

No problem, glad to have helped.