kislyuk / yq

Command-line YAML, XML, TOML processor - jq wrapper for YAML/XML/TOML documents
https://kislyuk.github.io/yq/
Apache License 2.0

Complicated filters seem to get transformed incorrectly on the way to jq #69

Closed. scr-oath closed this 5 years ago

scr-oath commented 5 years ago
xq '.testsuites.testsuite | {errors: ([.[]["@errors"] // empty | tonumber] | add)}' TESTS-TestSuites.xml
jq: error (at <stdin>:1): errors is not a valid format

vs.

xq . TESTS-TestSuites.xml | jq '.testsuites.testsuite | {errors: ([.[]["@errors"] // empty | tonumber] | add)}'
{
  "errors": 0
}

With input file TESTS-TestSuites.xml:

<?xml version="1.0" encoding="UTF-8" ?>
<testsuites>
  <testsuite errors="0" failures="0" hostname="thehost" id="0" name="the-test" package="the-package" skipped="0" tests="3" time="0.006" timestamp="2019-09-12T01:43:27">
    <properties />
  </testsuite>
  <testsuite errors="0" failures="0" hostname="thehost" id="0" name="the-other-test" package="the-package" skipped="0" tests="3" time="0.006" timestamp="2019-09-12T01:43:27">
    <properties />
  </testsuite>
</testsuites>
scr-oath commented 5 years ago

By hacking up the PATH to make jq be my own concoction, I'm able to see that it passes

.testsuites.testsuite | {errors: ([.[][@errors] // empty | tonumber] | add)}

Instead of

.testsuites.testsuite | {errors: ([.[]["@errors"] // empty | tonumber] | add)}

i.e., something along the way eats the " characters around "@errors", so jq sees @errors and tries to treat it as a format string.

kislyuk commented 5 years ago

yq does not manipulate or interpret the filter. I suspect your shell is eating the characters somewhere along the way. What shell are you using? Is xq '.testsuites.testsuite | {errors: ([.[]["@errors"] // empty | tonumber] | add)}' TESTS-TestSuites.xml the exact literal command, or are you running it from a subshell, script or some other outer layer?

scr-oath commented 5 years ago

I'm running that from bash, so only the outer ' quotes are removed. You can do the same: make your own jq and you'll see that the argument jq receives by way of xq/yq is different from what you pass to xq/yq.

kislyuk commented 5 years ago

When I do that, the filter is printed with the quotes intact.

scr-oath commented 5 years ago

Interesting… ok - will dig deeper…

kislyuk commented 5 years ago

For your reference, here is the fake jq that I made to check this:

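#!/usr/bin/env python3
# Fake jq: dump every argument it receives, one per line, for inspection.
import sys
with open("/home/Andrey/test.out", "w") as fh:
    for arg in sys.argv:  # sys.argv[0] is the script path itself
        print(arg, file=fh)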
scr-oath commented 5 years ago

I have a hunch… I think it's scl that's doing it… gah…

Ok, if I do

scl enable rh-python36 -- xq '.testsuites.testsuite | {errors: ([.[]["@errors"] // empty | tonumber] | add)}' TESTS-TestSuites.xml

I get the bad behavior; if I do

scl enable rh-python36 bash
# inside the bash prompt, do
xq '.testsuites.testsuite | {errors: ([.[]["@errors"] // empty | tonumber] | add)}' TESTS-TestSuites.xml

Then it works…

scr-oath commented 5 years ago

So scl isn't passing its args through as args; it's running them through bash -c, and that extra round of shell parsing is what eats the quotes 👎 That's too bad, but at least we can close this… thanks for your curiosity/patience :-D

scr-oath commented 5 years ago
[scr@C02VD2N0HTDD]$ scl enable rh-python36 -- ~/bin/printargs 'foo bar "baz"'
foo\ bar\ baz
[scr@C02VD2N0HTDD]$ ~/bin/printargs 'foo bar "baz"'
foo\ bar\ \"baz\"

With my printer as

[scr@C02VD2N0HTDD]$ cat ~/bin/printargs
#!/usr/bin/env bash

printf "%q\n" "$@"
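
The printargs output above (one argument in, one argument out, inner quotes gone) can be reproduced with a small model. This is a sketch under a loud assumption: it guesses that each argument gets wrapped in double quotes with nothing escaped before a shell parses the line again. That guess is only chosen because it matches the observed output; it is not taken from scl's source. `wrap_unescaped` and `reparse` are hypothetical helper names, and Python's `shlex.split` stands in for the shell's word parsing.

```python
import shlex


def wrap_unescaped(args):
    """Hypothetical model: wrap each arg in double quotes, escape nothing."""
    return " ".join('"' + arg + '"' for arg in args)


def reparse(command_line):
    """Approximate a second round of POSIX shell word parsing."""
    return shlex.split(command_line)


# The filter as bash hands it to the first process (outer ' already removed).
jq_filter = '.testsuites.testsuite | {errors: ([.[]["@errors"] // empty | tonumber] | add)}'

seen = reparse(wrap_unescaped(["xq", jq_filter, "TESTS-TestSuites.xml"]))
print(seen[1])  # the filter as the inner command would receive it

printargs_seen = reparse(wrap_unescaped(['foo bar "baz"']))
print(printargs_seen)
```

Under this model the embedded " characters close and reopen the outer quoting, so they vanish while the argument stays in one piece, which is the same shape as the `[@errors]` filter that the fake jq received.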
scr-oath commented 5 years ago

FWIW… if you ever run into this, this seems to be the solution:

scl enable rh-python36 "$(printf "%q " "$(basename "$0")" "$@")"
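
The idea behind the `printf "%q "` workaround is to quote every argument once more, so that one extra round of shell parsing gives back exactly the original arguments. Here is a minimal sketch of the same round trip in Python, assuming `shlex.quote` as a rough analogue of `%q` and `shlex.split` as a stand-in for the extra shell parse. `quote_for_shell` is a hypothetical helper name, and nothing here calls scl itself.

```python
import shlex


def quote_for_shell(args):
    """Build one command string that a POSIX shell parses back into args."""
    return " ".join(shlex.quote(arg) for arg in args)


jq_filter = '.testsuites.testsuite | {errors: ([.[]["@errors"] // empty | tonumber] | add)}'
args = ["xq", jq_filter, "TESTS-TestSuites.xml"]

command = quote_for_shell(args)
print(command)

# One round of shell-style parsing undoes exactly one round of quoting.
round_tripped = shlex.split(command)
print(round_tripped == args)
```

The quoting has to be applied exactly once per extra parse: quoting twice, or not at all, hands the inner command different arguments than the ones you typed.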
kislyuk commented 5 years ago

Never heard of scl before, but doing an extra shell interpretation like that is bound to cause problems in a lot of things.

scr-oath commented 5 years ago

Yeah, agreed that escaping is always a Pandora’s box. FWIW, scl is a Red Hat thing, kind of like virtualenv but for any type of “software collection” (hence the first two letters of the tool name). In this case the problem was a failure to escape. scl claimed to have two forms: one with a single arg, which should be passed through bash -c arg, and one with multiple args, which should have been clean: whatever you pass to it, either with execve down in C++ land or via shell arg handling, should be passed on to the command as-is. However, what it actually did was the absolute worst case: taking all the args and joining them with spaces to pass to bash -c. There's no hope in hell of the args reaching your tool as you passed them unless you go way too far into knowing or guessing that it's doing that and fight back with a hack. Ugly!

scr-oath commented 5 years ago

OK, I may have been reading an out-of-date manpage. The current one only describes the single-command form, so, great, you have to escape. Sorry for the mix-up.