micromatch / glob-fs

file globbing for node.js. speedy and powerful alternative to node-glob. This library is experimental and does not work on windows!
http://jonschlinkert.github.io/glob-fs
MIT License
55 stars 17 forks source link

first pass #2

Open jonschlinkert opened 9 years ago

jonschlinkert commented 9 years ago

@doowb / @tunnckoCore

I just pushed up a first pass at this. there is still a lot of work to be done but I think the basic conventions are a good start. I'd love feedback on the following:

any other thoughts would be great!

jonschlinkert commented 9 years ago

@doowb you've already seen some of this, thanks for the feedback and help with fleshing out some details!

tunnckoCore commented 9 years ago

oo yeah, again you pushed something before me ;d that semester session, nah. I had plans to rewrite the whole stack around and because that i fork some libs in vinyljs org and organize the process in my mind.

but yea, i'll review it later! thanks for mention.

tunnckoCore commented 9 years ago

@jonschlinkert mm okey, looks great and thanks for the push up.

I decided to use a file object in glob-fs, since it really simplifies options handling, etc.

:+1:

that's my thoughts

  1. Why not emit file on iteratorStream and iteratorAsync? hm..
  2. Why not use parse-glob instead of custom? im curios, it would provide more cool things
  3. Using through().obj() is very slow and main perf hit on glob-stream - i tested it, that's why i create custom readable stream in my glob-fs first-shots, but it's ugly - i have something locally.
  4. graceful-fs, included, but not used
  5. better and faster event emitter?
  6. Negations. It would be easy if we think for near future - support them only through options like we commented before few months, and because node-glob v6 deprecation announcement. I rethink the whole #1 thread before few weeks and got it.
  7. For externalizing middlewares.. names? glob-fs-exclude and *-include? We also should update them, cuz they currently if pattern is regex will fail (didn't look the tests yet). Why? Let say exclude.. if pattern is regex, it will pass it to isnt.js and it returns function that expect object, but it receives string file.path when returned function from exclude middleware is executed. Hm.. or im sleepy? huh, maybe, sorry if it is true.
  8. About multiple patterns support.. im a little bit confused about how to do them.
  9. use mz/fs instead? it will use graceful-fs if it is in dependencies.

Or We can combine iteratorAsync and iteratorPromise in one which would be (as i call them) - hybrid. Hybrids are cool thing, actually i love them, lol - because they are in right place in right time (migrating to es6 times). They always will return promise and always accept callback in same time (almost like thenify, thanks). So we have is-promise and is-hybrid lol. isPromise(hybrid) //=> true and isHybrid(promise) //=> false. So this will be possible (as seen in hybridify readme)

var glob = require('glob-fs')({gitignore: true})

glob.readdir('*.js', {foo: 'bar'}, function (err, files) {
  console.log(err || files)
})
.then(console.log)
.catch(console.error)

it would work both in same time or separately when only callback is given or when only then/catch. And this will drop readPromise api.


I can try to do new iteratorStream with my custom readable stream approach. which can be as standalone module.

tunnckoCore commented 9 years ago

@jonschlinkert are you thinking that module to pass some/all node-glob and glob-stream tests? I think it's impossible if we want to move forward to future. They are wrong by architecture and some of their tests are because these wrong steps and semi-fixes of some bugs or people wrong understandings.

Im passing 95% of the glob-stream tests and i cant believe, it's just POC. For other 5% I dont want to implement and do wrong things. Didn't try node-glob tests yet.

jonschlinkert commented 9 years ago

Why not emit file on iteratorStream and iteratorAsync? hm..

I thought I was on async, but @doowb and I were thinking it might not be necessary with stream. but maybe we should just do it to be consistent across the iterators... the fact that you brought it up after we wondered the same thing probably answers the question.

Using through().obj() is very slow

yeah we can optimize after we get the conventions down. I'm much more focused on making sure that we offer something innovative and uniquely valuable with this approach first

graceful-fs, included, but not used

still linting... there are actually a few files that aren't even used. some of this is mentioned on the TODO list

Negations. It would be easy if we think for near future

so, pattern matching in general has never been the hard part. it's deciding when to include or exclude that's hard. or rather, hard to make a convention around. I can elaborate when get a little further.

For externalizing middlewares.. names?

also mentioned on readme :) (no worries, there's a lot there, and it's not finished yet. I don't expect you to notice it all)

didn't look the tests yet)

that's easy, there aren't any here yet lol. I have a bunch but haven't copies them over to this project yet. I re-created the lib like 4 times and just didn't get to that yet

if pattern is regex, it will pass it to isnt.js...

See this section. Middlewares should always return the file object. Whatever the regex tests, typically you will either modify file.include or file.exclude so that the handler knows what to do with the file after it's returned from the middleware, and the next middleware will always get a file object.

About multiple patterns support...

no worries, I think I have a good way to do this... for now we should just make this work for a single pattern so the focus stays on middleware conventions. then we can fold in the multi-pattern support.

Also we can combine iteratorAsync and iteratorPromise

I'm somewhat of a purist, so I'd probably prefer just adding another iterator. @doowb and I talked about just allowing the user to register iterators like we do with loader-cache. that approach has been working well for us... either way it would be great to have you add an iterator. as your own lib is great, or as a pr. it's totally up to you. that said, if there are ideas for improving existing iterators, I'd love to see that as prs so we can optimize those.

are you thinking that module to pass some/all node-glob and glob-stream tests?

no. I agree this is not a good goal to have. I created this module explicitly to not have to support every edge case under the sun in the core module. middleware can handle edge cases, so those unit tests would be up to the implementors. we just need to figure out how much is enough and go from there...

as long as the iterators/handlers and inclusion/exclusion logic actually works, its trivial for a user to support any little feature by just defining an inline middleware, like:

var glob = require('glob-fs')({ gitignore: true })
  .use(function (file) {
    // do stuff
    return file;
  });
tunnckoCore commented 9 years ago

no. I agree this is not a good goal to have.

awesome. yea maybe it was stupid question, when we have "middleware" idea here. lol, damn im tired.

See this section. Middlewares should always return the file object.

yea, i read the readme 2 times, lol. And that's clear. The problem is that when pattern is regex isnt would return function (actually middleware) which is assigned to isMatch which later is called in the if statemant and cant act like mm.matcher. But okey, maybe im wrong lol.

and I talked about just allowing the user to register iterators like we do with loader-cache.

yea, definitely it would be great

jonschlinkert commented 9 years ago

yea maybe it was stupid question

no, not at all. any question is okay.

about the matching, yeah we need to optimize that

tunnckoCore commented 9 years ago

haha. Also it would be awesome if we use option-cache i felt in love with it these weeks, lol. it makes the things looks sooo nice and neat.

jonschlinkert commented 9 years ago

Awesome! Lol yeah I love that lib too.

tunnckoCore commented 9 years ago

Another thinking, when i'm looking latest updates to loader-cache... if we use it we be able to do crazy things like to set iterator-sync and use globby.sync lol. it's awesome, but the purpose of this lib maybe is a lil' bit another, currently, lol.

Hm.. or maybe better would be to be just api normalizer (i mean like consolidate, jstransformers and etc) and we will just implement more and more middlewares and iterators/loaders. So this module will just handle filepaths loading (loader-cache), middleware-ing (plugins, ware, bach or my upcoming awesome npm.im/benz) and cool option-cache api for controling options. So then everyone will just register loaders/iterators and middlewares, he even will be able to register node-glob or globby or glob-stream if he want.

I think the idea here in that "first pass" is great and the way is great, but also think we should concentrate in absolute core idea - matching, multiple support, performance, exclusion and inclusion and etc. Then we will just give it to glob-fs and boom, we are awesome LOL.

tunnckoCore commented 9 years ago

hah. exact what i was thinking


what about that?

// glob-fs (second shot ;d)

var glob = require('glob-fs')()
var opts = {
  dot: true,
  matchBase: true
  ignore: 'baz/foo.txt'
}

glob
  .enable('multiple')
  .enable('recurisve')
  // .option('type', 'stream')
  .option('type', 'async')
  .option('ignore', 'baz/qux1.txt')
  .option('ignore', 'baz/qux2.txt')
  .option('ignore', ['baz/s*.txt', 'foo/bar.js'])
  // will load some default iterators/loaders
  // which can be most used, and we just can focus on core
  // .use(globby)
  .on('error', console.log)
  .on('file', console.log)
  .on('dir', console.log)
  // it always will be just `.readdir` method
  .readdir(['foo/*.js', 'baz/*.txt'], opts, function (err, files) {
    // `err` should be always `null` here?
    // cuz it would be handled by central mechanism?
  })

in given fs tree

foo/aaa.js
foo/bbb.js
foo/bar.js
baz/qux1.tx
baz/qux2.tx
baz/sea.txt
baz/kkk.txt
baz/foo.txt

files would be foo/aaa.js, foo/bbb.js and baz/kkk.txt