jqlang / jq

Command-line JSON processor
https://jqlang.github.io/jq/
Other
30.47k stars 1.58k forks source link

-f filename crashes jq on windows #1072

Closed jankatins closed 8 years ago

jankatins commented 8 years ago

I want to use jq as a git smudge filter for ipython notebooks, but when I tried to run jq with jq -f c:\path\to\ipynb_filter.jq, it crashes: "jq does not work anymore" popup and the question if I want to debug (which I have no idea how to do: I only see some "exception at "...).

jankatins commented 8 years ago

I'm not sure what is happening here:

jq -f C:\Users\ipynb_stripoutput.jq
jq: Could not open C:\Users\ipynb_stripoutput.jq: No such file or directory
jq: error: syntax error, unexpected INVALID_CHARACTER, expecting $end (Windows cmd shell quoting issues?) at <top-level>, line 1:
def banner: "\(.) " + (28-(.|length))*"-";
jq: 1 compile error
jankatins commented 8 years ago

This again lets it crash:

jq "." CoAuthorNet_1_loadData.ipynb

[Using cat CoAuthorNet_1_loadData.ipynb | jq "." works)

The problem is that this is the way which would be used by a git diff driver ala

[diff "ipynb_off"]
    textconv = "jq -f \"C:\\Users\\jschulz\\Dropbox\\Programme\\cmder\\vendor\\jasc\\nbflatten.jq\" --arg show_output 0"
    cachetextconv = false

This would result in the following two calls to jq:

jq -f C:\Users\jschulz\Dropbox\Programme\cmder\vendor\jasc\nbflatten.jq --arg show_output 0 CoAuthorNet_1_loadData.ipynb
jq -f C:\Users\jschulz\Dropbox\Programme\cmder\vendor\jasc\nbflatten.jq --arg show_output 0 C:/Users/jschulz/AppData/Local/Temp/hk7trc_CoAuthorNet_1_loadData.ipynb 
nicowilliams commented 8 years ago

Was one of the crashes a result of using long paths? The maximum path length on Windows is 260 bytes.

I've tried using Windows-style paths and those work for me, so that's not it. Non-existent paths also don't cause jq to crash on Windows.

As for program file contents as UTF-8 with a BOM... Hmmm, perhaps we should support that, yes.

This does crash jq for me: jq "." CoAuthorNet_1_loadData.ipynb, but if I rename the file to CoAuthorNet_1_loadData.json then it does not crash. Is .ipynb special? No, renaming it to CoAuthorNet_1_loadData.jsoni also yields a crash. I'm willing to believe the problem is in the C run-time.

jankatins commented 8 years ago

re long path: that's the exact command:

c:\data\phd\academicsocialcapital (master)
λ jq -f C:\Users\jschulz\Dropbox\Programme\cmder\vendor\jasc\test.jq --arg show_output 0 test.json

test.jq is simple . (ANSI encoded, so no BOM), test.json is ["jan",1,2,3] (ANSI)

Re renamed ipynb: same here: renaming it to json works. But the above does not, it still crashes if I use the above -f argument. ipynb are IPython/Jupyter Notebook files: https://github.com/blog/1995-github-jupyter-notebooks-3

I try to implement this: https://gist.github.com/takluyver/bc8f3275c7d34abb68bf by using this file: https://gist.github.com/jfeist/cd00aa3b681092e1d5dc

I no got it to work by using this batch file as a git diff textconv, but this is a kind of hack :-( :

@echo off
setlocal
SET mypath=%~dp0
echo %mypath:~0,-1%

set cur_arg=%1
:loop
shift
if [%1]==[] goto afterloop
set params=%params% %cur_arg%
set cur_arg=%1
goto loop
:afterloop

set input_file=%cur_arg%
set rest_of_args=%params%
cat %input_file% | jq -f %mypath%\nbflatten.jq %rest_of_args%
endlocal

nbflatten.jq:

def banner: "\(.) " + (28-(.|length))*"-";
("Non-cell info" | banner), del(.cells), "",
(.cells[] |  ("\(.cell_type) cell" | banner),
         "\(.source|add)",
         if ($show_output == "1") then
                 "",
         ( select(.cell_type=="code" and (.outputs|length)>0) |
           ("output" | banner),
           (.outputs[] |
            (select(.text) | "\(.text|add)" | rtrimstr("\n")),
            (select(.traceback) | (.traceback|join("\n"))),
            (select(.text or .traceback|not) | "(Non-plaintext output)")
           ),
           ""
         )
              else ""
          end
)
nicowilliams commented 8 years ago

@JanSchulz There are things I can do to crash jq, but the command you give (with slightly different directory names) works for me. Oddly, I get it to crash when the data file path is absolute and that long. The failure happens when I use C:\Users\username\Dropbox\Programme\cmder\test.son as the data filename. Shorter paths work fine.

nicowilliams commented 8 years ago

Oh, I think I know what it is. The result of MultiByteToWideChar() is a count of widechars, not bytes.

nicowilliams commented 8 years ago

Yes, that fixes it.

@dtolnay Should we now make a 1.6 or a 1.5.1 release? What think ye?

nicowilliams commented 8 years ago

I wish I could attach executables here...

jankatins commented 8 years ago

Getting the jq.exe from here: https://ci.appveyor.com/project/JanSchulz/jq/build/1.0.27/artifacts (https://github.com/stedolan/jq/pull/1074)

I can comfim, that this works now:

jq -f "C:\\Users\\jschulz\\Dropbox\\Programme\\cmder\\vendor\\jasc\\nbflatten.jq" --arg show_output 0 Editors_0_GenerateLists.ipynb

Will test the rest later...

jankatins commented 8 years ago

@nicowilliams @dtolnay any news on a new release?

nicowilliams commented 8 years ago

The only way I'd do a release is if we just release master in its current state as either 1.5.1 or 1.6 (I prefer 1.6). I'm not up for backporting bug fixes to the 1.5 branch (which is gone but can get recreated from the jq-1.5 tag) -- I'm ETOOBUSY with other things. @dtolnay hasn't said anything, which I suspect means he's even busier than I, so we may have to wait a bit.