rycus86 / githooks

Githooks: per-repo and global Git hooks with version control
MIT License
382 stars 20 forks source link

Whitespaces in paths break for loop #29

Closed gabyx closed 5 years ago

gabyx commented 5 years ago

I searched for all find statements and found some which do not work with whitespaces in path names, I corrected them.

Thanks for reviewing this.

coveralls commented 5 years ago

Pull Request Test Coverage Report for Build 251


Changes Missing Coverage Covered Lines Changed/Added Lines %
install.sh 9 10 90.0%
<!-- Total: 36 37 97.3% -->
Totals Coverage Status
Change from base Build 237: 0.09%
Covered Lines: 1597
Relevant Lines: 1760

💛 - Coveralls
gabyx commented 5 years ago

Jeah, of course.

gabyx commented 5 years ago

Maybe lets add the tests later in another PR. We should actually run all tests against some whitespaces. I can probably test this by simply hacking the exec-tests.sh and looking for failed tests.

rycus86 commented 5 years ago

I'm more concerned about IFS on all platforms, so a duplicate test would have the benefit of testing it on all the platforms that are configured already.

gabyx commented 5 years ago

Ah men: Shitty IFS crap is so dangerous:

   echo "$LOCAL_REPOSITORY_LIST" | while IFS= read -r EXISTING; do
        echo "Ex: $EXISTING"
        EXISTING_REPO_ROOT=$(dirname "$EXISTING")
        install_hooks_into_repo "$EXISTING_REPO_ROOT"
    done

Output

Do you want to install the hooks into existing repositories? [y/N] Where do you want to start the search? [/root]
LIST: /tmp/test045/001/.git
/tmp/test045/002/.git
with xargs:
f: /tmp/test045/001/.git
f: /tmp/test045/002/.git
Ex: /tmp/test045/001/.git  <<<<.---- WHY ONLY THE FIRST one??
Looks like you don't have a .githooks folder in the /tmp/test045/001 repository yet.

does not work, but if I comment out install_hooks_into_repo then the loop processes all \n delimited paths correctly, otherwise only the first is processes....???

I prefere something more robust instead: xargs

gabyx commented 5 years ago

Also the decision to not have ' quotes doesnt enlighten me much, since sometimes you want to control or restrict expansion, and this decision only makes it harder... at least for me...

gabyx commented 5 years ago

Ah its because the install script mocks out </dev/tty which in my case reads my loop stdin... shit.

gabyx commented 5 years ago

The cleanest way I found is the following in f9a214456 the while loop changes the stdinput which is bad and undefined behavior for successive calls. I testet various methods (for loop, with IFS resetting, looks totally ugly) etc...

gabyx commented 5 years ago

I added the test: step-045w.sh with whitespaces which would fail if this bugfix is not applied.

gabyx commented 5 years ago

I replaced all tests to have whitespaces in a lot of paths (next two commits), this is definitely something we need to check and detect failure early, Thats why I suggest we use all these new tests

Lots of tests failed, due to wrong IFS and other stuff... Meaning in this way I found lots of errors. The tests (did not need any changes) all pass and work 👍 now except test-094.sh which downloads the master file, which of course does not work with whitespaces -> When you merge this test should succed as well. Thanks for reviewing this and merging this in. This is super important. Whitespaces should not mess up things

rycus86 commented 5 years ago

Sorry, I was unavailable for a few days, just looking at this now. 82 changed files though doesn't really help. :) Can we first do the actual change and perhaps add one or two new tests rather than change all the existing ones?

On the actual whitespace splitting problem, I just tried this in alpine's shell:

# find . -type f
./def ghi/test
./abc/test file
./jkl/sample
./jkl/mno/another file
# RLIST=$(find . -type f)
#
# this below is what you're trying to fix, right?
# for T in $RLIST; do echo "Item: $T"; done
Item: ./def
Item: ghi/test
Item: ./abc/test
Item: file
Item: ./jkl/sample
Item: ./jkl/mno/another
Item: file
# IFS=",
"
# for T in $RLIST; do echo "Item: $T"; done
Item: ./def ghi/test
Item: ./abc/test file
Item: ./jkl/sample
Item: ./jkl/mno/another file

Should we then just ensure all unquoted for loops are executed with the appropriate IFS set?

gabyx commented 5 years ago

Correct. But setting the IFS is really a bit buggy. Because if we set it, then all commands inside the for loop have the same IFS, which is buggy (when we change things... nested for loop...)

NIFS=",
"
IFS="$NIFS"
for T in $RLIST; do 
   unset IFS
   command args1
   IFS="$NIFS"
done

I dont know if read also uses the IFS... I force pushed two commits, the Bugfix, and all tests with whitespaces... Could you look at the changes first? The tests are only whitespace replacements, nothing changed there.

gabyx commented 5 years ago

The robust reference is: https://unix.stackexchange.com/questions/9496/looping-through-files-with-spaces-in-the-names

gabyx commented 5 years ago

by the way: image In this way I used:

 # We dont want to change the std input through
    # a while loop, so we redirect over descriptor 3
    INPUT_PIPE=$(mktemp)
    rm -rf "$INPUT_PIPE" >/dev/null 2>&1 || true
    mkfifo "$INPUT_PIPE"
    find "$START_DIR" -path "*templates/hooks/pre-commit.sample*" 2>/dev/null >"$INPUT_PIPE" &
    while IFS= read -r HIT <&3; do
gabyx commented 5 years ago

I make these changes in the urge to use this tool on windows, but as you know windows is crap and also whitespaces in paths are used a lot there... Maybe we should also add an appveyor test or something like that :-), basically just installing gitforwindows and then run everything there :-)

rycus86 commented 5 years ago

Thanks! I had the pain privilege of working with Windows for a long time, I don't envy you... :) For SC2044, my understanding is that find | grep is less buggy in a for loop than plain finds, not sure where I've got this from and could be wrong. All the changes look OK to me, except the ones that now use temporary files and pipes and whatnot. Is there really nothing simpler we can do? Those are two loops only, even setting IFS multiple times would be simpler I think.

gabyx commented 5 years ago

Jeah I dont like the pipe too. Maybe we can just use the IFS and stick with it if it works. Let me change it.

rycus86 commented 5 years ago

OK, I think this almost looks ready go. :+1: Added some comments for some style changes, plus please revert all the test changes for now. One (new) whitespace test would be nice, but I'm happy to add one myself after merging this PR. And I'll also add a new Travis job to automatically change all the paths in the test scripts to have a space, and test the whitespaces in general that way.

rycus86 commented 5 years ago

Oh step-006 was failing on Travis with your latest changes. :thinking:

gabyx commented 5 years ago

OK, I think this almost looks ready go. 👍 Added some comments for some style changes, plus please revert all the test changes for now. One (new) whitespace test would be nice, but I'm happy to add one myself after merging this PR. And I'll also add a new Travis job to automatically change all the paths in the test scripts to have a space, and test the whitespaces in general that way.

That sounds perfect. I do the review in moment. What do you mean by step-006 was failing? I check the tests when I ve done the review. Lets see. Thanks :-)

gabyx commented 5 years ago

Can we enumerate LOOP_IFS_<linenumber> and have a pre-commit script which reassigns the line numbers always...?

HAVING two LOOP_IFS for nested for loops is the same problem as with the IFS...

I coming to the feeling that shell ist just the wrong tool for these kind of things, its a crappy language ;-)?

rycus86 commented 5 years ago

Can we enumerate LOOP_IFS_<linenumber> and have a pre-commit script which reassigns the line numbers always...?

Why is this a concern?

gabyx commented 5 years ago

Because of nestet for loops: Using a LOOP_IFS=... might might be reset if another shell function iside the loop resets LOOP_IFS to something else... Maybe define global< some IFS:

IFS_COMMA_NEWLINE
IFS_NEWLINE

and only use them to reassign IFS


IFS="$IFS_COMMA_NEWLINE"
for ...
unset IFS
...
IFS="$IFS_COMMA_NEWLINE"
done
rycus86 commented 5 years ago

Because of nestet for loops: Using a LOOP_IFS=... might might be reset if another shell function iside the loop resets LOOP_IFS to something else...

OK, what I meant to ask is: is this a real concern here? Does it happen already? How hard/easy it is to make it happen?

If it's not super easy to mess up, I'd not worry about it for now.

gabyx commented 5 years ago

See my latest commit. global IFS to use in LOOPS, that should fix also potential future problems. It think this way its kind of more robust.

rycus86 commented 5 years ago

Nice, looks good to me! :+1: Let's wait for Travis then I'll merge it in.

rycus86 commented 5 years ago

Hm, somethings seems to have broken around shared repo lists, see steps 082 and 092.

gabyx commented 5 years ago

Seen it :-) Will investigate.

rycus86 commented 5 years ago

Hm, somethings seems to have broken around shared repo lists, see steps 082 and 092.

OK, found the problem I think, left a comment.

gabyx commented 5 years ago

Thx!

rycus86 commented 5 years ago

Thx!

Thank you! :)