sindresorhus / globby

User-friendly glob matching
MIT License
2.52k stars 130 forks source link

globby is b0rked on Windows: .sync nor .async deliver /any/ result. #155

Open GerHobbelt opened 3 years ago

GerHobbelt commented 3 years ago

Related to #152, but .sync is broken as well: doesn't matter what you try, globby won't produce any results.

Sample code:

var testset = globby.sync(__dirname + '/specs/*.jisonlex');
console.error({testset, dir: path.normalize(__dirname + '/specs/')});

produces this:

{
  testset: [],
  dir: 'W:\\Projects\\sites\\library.visyond.gov\\80\\lib\\tooling\\jison\\packages\\lex-parser\\tests\\specs\\'
}

while that directory is filled with tens of *.jisonlex files.

Ditto for the original code:

  var testset = globby.sync([
    __dirname + '/specs/*.jison',
    __dirname + '/specs/*.lex',
    __dirname + '/specs/*.jisonlex',
    __dirname + '/specs/*.json5',
    '!'+ __dirname + '/specs/*-ref.json5',
    __dirname + '/specs/*.js',
    __dirname + '/lex/*.jisonlex',
  ]);
console.error({testset, dir: __dirname});
papb commented 3 years ago

Hi! What NodeJS version? Did you try other versions?

GerHobbelt commented 3 years ago
$ node --version
v12.18.4

No, haven't tried other versions, but have seen this issue before. Then I switched tactics as I was in a hurry to get things done, so I don't expect this to be a node version specific thing.

GerHobbelt commented 3 years ago

Side Note: globby use like this (i.e. without directories in the search argument) works as expected:

globby([ 'ebnf-parser*.js' ])
GerHobbelt commented 3 years ago

New info

Worky 😥 (phew!)

This works on Windows:

process.chdir(__dirname);
var testset = globby.sync(['./specs/*.jisonlex', './lex/*.jisonlex']);

As long as you

Not Worky 1

Ergo: doing something like

let basepath = path.normalize(path.join(__dirname, ......)).replace(/\\\\/g, '/'); // UNIXify path
// ^^^ INSUFFICIENT, as the Windows drive letter plus colon, e.g. 'W:/' will make it through
// and then globby will barf a hairball, delivering NIL results!
globby(basepath + '/specs/*.jisonlex', ......);

will FAIL.

Not Worky 2

Ditto for any inattentive coding like this, where you employ path.join just to shut up eslint et al -- as would happen with the case above:

(note the trouble-causing path.join in the globby statement; compare to previous example)

let basepath = path.normalize(path.join(__dirname, ......)).replace(/\\/g, '/'); // UNIXify path
// ^^^ AS PREVIOUS EXAMPLE ABOVE.
globby(path.join(basepath, 'specs/*.jisonlex', ......));
// ^^^ BAD JUJU: path.join screws you over as it will inject a '\' and you will be TOAST, once again.

Conclusion

Only work with paths coding and globby when you're totally hopped up on caffeine and bright and shiny yourself. 👼

"Using relative UNIXy paths only" won't save your bacon because:

Be vewwy vewwy careful and on the look-out for hidden backslash-in-a-path delivery boys, such as:

phawxby commented 3 years ago

@GerHobbelt yes, adding this test to test.js fails on Windows but passes on Mac/Unix.

test('glob - async - absolute', async t => {
    const temporaryAbsolute = path.resolve(temporary);

    const result = (await globby(path.join(temporaryAbsolute, '*.tmp'))).sort();

    t.true(result.length === 5);
    for (const [i, element] of fixture.entries()) {
        t.true(result[i].endsWith(element));
    }
});

However, as a workaround you can do .replace(/\\/g, '//'). This test will pass.

test('glob - async - absolute', async t => {
    const temporaryAbsolute = path.resolve(temporary);

    const result = (await globby(path.join(temporaryAbsolute, '*.tmp').replace(/\\/g, '//'))).sort();

    t.true(result.length === 5);
    for (const [i, element] of fixture.entries()) {
        t.true(result[i].endsWith(element));
    }
});

Definitely some kind of bug in handle of full system paths combined with path.sep. cc. @sindresorhus

GerHobbelt commented 3 years ago

@phawxby: yup, did another test and the multiple-drive search scenario seems to pass as well.

🤔 What did go wrong before over here then as when I did the replace thingy it was not enough? 🤔

A-ha! I did it too early: before the path.normalize() calls like in

globby([
   path.normalize(p1.replace(/\\/g, '/')),
   path.normalize(p2.replace(/\\/g, '/')),
  ...
], ...

which, 🤦 of course 🤦 , would happily convert those UNIX slashes back to Windows' \ backslashes again and thus throw me a curveball. 😢 (I did not re-check the replace-only work-around after I found out path.normalize was having me on when I wrote the previous message and results therein. 😓 )

So the chdir+useRelativePathsOnly approach is not required, while my list of culprits to watch out for stands:

Them Nasty Buggers


Back on the topic of globby: what really fazes me now is the question how I got away with that __dirname-based stuff passing into globby. 🤔 Apparently I had some magic mix that I cannot bring back, as one thing's for sure: this was in a unit test rig which has run many times before (at least in 2018/2019).

What changed since that stuff got run last time?

Over time...

And after all these updates, said test rig went t*ts up. (Not a big surprise. What was a surprise was that globby was doing all the b0rking and once I got fed up and hardcoded the search result set using UNIX find and some shell scripting + fs.readfileSync().split('\n') instead of globby all tests behaved as before, no sweat.

Anyway, that's where I chose to file an issue this time around. And here we are. 😅

Anyway, I hope some poor bugger will find some useful help in all this.


Which changes this to maybe a choice:

a. make globby cope with backslash paths (or at least warn about them entering the premises, maybe?) despite \ being a (ahem) legal filename character on UNIX. (Note the K&P quote in https://stackoverflow.com/questions/1976007/what-characters-are-forbidden-in-windows-and-linux-directory-names for extra fun.) b. add a bit to the documentation about Windows folks being extra special careful with them path babies.

(Incidentally, I bet the other globber libs out there suffer the same issue. I seem to recall some similar struggle with node-glob or what-was-it-I-used-back-in-the-day?)

AndrewLeedham commented 3 years ago

@GerHobbelt @phawxby I believe this was deliberate and was flagged as a breaking change in the v10.0.0 release, as fast-glob changed this behaviour in their v3.0.0 release.

Fast-globs proposed workaround is .replace(/\\/g, '//') https://github.com/micromatch/micromatch#backslashes. So I don't see this being fixed in globby as it will break special character escaping.

SrBrahma commented 3 years ago

I am only using globby to get all the deep filenames from a given cwd (so the results won't contain the parents dirs). The glob pattern is just ''. Do I need to care about Windows/Posix fs? Because it says the cwd defaults to process.cwd(), and it on Windows, will contain backslashes.