qri-io / qri

you're invited to a data party!
https://qri.io
GNU General Public License v3.0
1.11k stars 66 forks source link

`qri save` segfaults with transform #1207

Closed feep closed 4 years ago

feep commented 4 years ago

[Changed title of issue, I no longer think it has anything to do with renaming the directory.]

qri 0.9.6, macOS 10.14.6

No test case, sorry. If I can reproduce, I'll add one.

Something like:

  1. Working linked dataset directory.
  2. Rename for a better git commit (Didn't know this was against the rules).
  3. Make a transform.star change after commit.
  4. Run qri save.
  5. segfault.

Trying to make a test case, I moved it back to where it belonged. Success.

Renamed it again, I could not make it segfault again. Proper 'missing file' sort of errors.

What I think happened?

transform worked with the correct path. When I put it back in the wrong place the diff code didn’t get called anymore because there was no longer a diff.

I did some things to cause diffs in the wrong directory again. But I keep getting a proper error instead of a segfault.

Anyway, that’s all I got.

@ me if you want me to keep poking at it.

✅ transform complete
⠙ panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x38 pc=0x464fca8]

goroutine 32 [running]:
github.com/qri-io/deepdiff.bestCandidate(0xc0010a8420, 0x2, 0x2, 0x58d0460, 0xc000487500, 0x1049)
    /Users/dlong/dev/code/go/qri/pkg/mod/github.com/qri-io/deepdiff@v0.1.1-0.20200305020550-8173efebcaa1/deepdiff.go:196 +0x248
github.com/qri-io/deepdiff.(*diff).queueMatch.func1(0xc00014e5a0, 0xc0003aef90, 0xc000909710, 0x1049, 0xc00014e660)
    /Users/dlong/dev/code/go/qri/pkg/mod/github.com/qri-io/deepdiff@v0.1.1-0.20200305020550-8173efebcaa1/deepdiff.go:154 +0x182
created by github.com/qri-io/deepdiff.(*diff).queueMatch
    /Users/dlong/dev/code/go/qri/pkg/mod/github.com/qri-io/deepdiff@v0.1.1-0.20200305020550-8173efebcaa1/deepdiff.go:131 +0xec
dustmop commented 4 years ago

Is this the full stacktrace? If not, could you post the entire thing?

feep commented 4 years ago

Yup. That's all I got.

If it's missing lines, I have no idea why.

dustmop commented 4 years ago

Ok, could be a goroutine thing. Thanks for filing, I'll try digging into it later.

Also, to clarify, renaming a folder isn't so much "against the rules", it's just that we considered it a bit more advanced case, and so haven't prioritized it. It's something we definitely want to support soon.

feep commented 4 years ago

Sorry. When I said "against the rules," I meant "breaks things."

I don’t know the rules. =]

If you run into a wall when you’re looking into this, I may be able to come up with a test case. I know it will take time so I'd rather skip it for now. Hopefully you can find the bug without one.

Let me know if hit a wall and want me to try to write one.

feep commented 4 years ago

qri 0.9.7-dev 8c302c79d7eee9c0e804db792e036ae973f9ae3c

Mailed ca_state_parks_dustmop.zip to @dustmop.

Contents

Why I'm afraid to generate a test case

  1. It only fails on qri save.
  2. If I try to trim down structure.json and it does not segfault, the state is changed at that point and I no longer have a test case for you.

What I remember

qri save --no-color --dry-run -f ./transform.star me/humba no segfault, works fine

qri save with transform (now renamed transform_disabled.star): segfault

qri save --no-color --dry-run -f ./transform.star me/humba, then copy pasted structure into structure.json to see if I could work around the crash.

qri save with hidden transform, working body.csv and structure.json copypasted from --dry-run: segfault

With any luck this should do something like this:

❯ qri save
for linked dataset [feep/ca_state_parksx]

⠋ 📡 running download...
🤖  running transform...
✅ transform complete
⠙ panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x38 pc=0x46531e8]

goroutine 37 [running]:
github.com/qri-io/deepdiff.bestCandidate(0xc000718000, 0x2f, 0x40, 0x5a26800, 0xc0006b5980, 0x2228)
    /Users/rusty/go/pkg/mod/github.com/qri-io/deepdiff@v0.1.1-0.20200305020550-8173efebcaa1/deepdiff.go:196 +0x248
github.com/qri-io/deepdiff.(*diff).queueMatch.func1(0xc00049a0c0, 0xc000189ec0, 0xc0006b3650, 0x2228, 0xc00049a120)
    /Users/rusty/go/pkg/mod/github.com/qri-io/deepdiff@v0.1.1-0.20200305020550-8173efebcaa1/deepdiff.go:154 +0x182
created by github.com/qri-io/deepdiff.(*diff).queueMatch
    /Users/rusty/go/pkg/mod/github.com/qri-io/deepdiff@v0.1.1-0.20200305020550-8173efebcaa1/deepdiff.go:131 +0xec

❯ rm rm body.csv structure.json

❯ cp transform_disabled.star transform.star

❯ qri save

# Should segfault in a very similar way.
dustmop commented 4 years ago

Question: are you running this command you posted above:

qri save --no-color -- dry-run -f ./transform_disabled.star me/humba > dry_stdout.txt 2> dry_stderr.txt

or:

qri save --no-color --dry-run -f ./transform_disabled.star me/humba > dry_stdout.txt 2> dry_stderr.txt

In other words, is there a space between -- and dry-run?

dustmop commented 4 years ago

I tried making a new checked out dataset, then copying the provided files (body.csv, structure.json, meta.json) into it. Ran qri save and it works okay. Then I copied in the transform_disabled.star file and renamed it to transform.star. When I try to save I get:

> qri save
for linked dataset [dustmop/some_ds]

transform.star:100:27: invalid escape sequence \[

Which looks like it refers to this line:

    jsonblob = re.findall("\[.*]", jsonblobhits[0])[0]

Can't reproduce this segfault.

feep commented 4 years ago

In other words, is there a space between -- and dry-run?

No. That was a copypaste error trying to explain how I generated stderr.txt and stderr.txt.

I'll try with 0.9.7 release later. Someday I'll get you a test case.

Thanks.

dustmop commented 4 years ago

This is now fixed by https://github.com/qri-io/deepdiff/pull/11 and https://github.com/qri-io/qri/commit/2f5e42cc6bbe751c3961fa6117f95a060de10574, and the newest release of qri includes the fix.