src-d / gitbase

SQL interface to git repositories, written in Go. https://docs.sourced.tech/gitbase
Apache License 2.0
2.06k stars 124 forks source link

Non-persistent behavior on logging bblfsh parse errors #955

Closed lwsanty closed 4 years ago

lwsanty commented 5 years ago

Having test in bblfsh-mockups regression suite One of it's sub-cases uses bblfsh mockup that returns such response

&v2.ParseResponse{Language: "go", Uast: bufData, Errors: nil},

where bufData = getUASTBytes("a")

func getUASTBytes(text string) ([]byte, error) {
    node := nodes.String(text)
    buf := bytes.NewBuffer(nil)
    if err := nodesproto.WriteTo(buf, node); err != nil {
        return nil, err
    }
    return buf.Bytes(), nil
}

expected gitbase output

time="2019-08-23T11:44:03+03:00" level=info msg="server started and listening on localhost:3306"
time="2019-08-23T11:44:04+03:00" level=info msg="NewConnection: client 1"
time="2019-08-23T11:44:04+03:00" level=info msg="audit trail" action=authentication address="127.0.0.1:58320" success=true system=audit user=root
time="2019-08-23T11:44:04+03:00" level=info msg="audit trail" action=authorization address="127.0.0.1:58320" connection_id=1 permission=read pid=1 query="select file_path, uast(blob_content) name \nfrom refs natural \njoin commit_files natural \njoin blobs \nwhere LANGUAGE(file_path) = 'Go' \nlimit 10" success=true system=audit user=root
time="2019-08-23T11:44:04+03:00" level=warning msg="unable to parse the given blob using bblfsh: rpc error: code = Canceled desc = context canceled"
time="2019-08-23T11:44:04+03:00" level=info msg="audit trail" action=query address="127.0.0.1:58320" connection_id=1 duration=40.864078ms pid=1 query="select file_path, uast(blob_content) name \nfrom refs natural \njoin commit_files natural \njoin blobs \nwhere LANGUAGE(file_path) = 'Go' \nlimit 10" success=true system=audit user=root
time="2019-08-23T11:44:04+03:00" level=info msg="ConnectionClosed: client 1"

the most crucial line to assert is

time="2019-08-23T11:44:04+03:00" level=warning msg="unable to parse the given blob using bblfsh: rpc error: code = Canceled desc = context canceled"

observed in 20% of this test runs the output does not contain parse error line

time="2019-08-23T11:44:11+03:00" level=info msg="server started and listening on localhost:3306"
time="2019-08-23T11:44:12+03:00" level=info msg="NewConnection: client 1"
time="2019-08-23T11:44:12+03:00" level=info msg="audit trail" action=authentication address="127.0.0.1:58326" success=true system=audit user=root
time="2019-08-23T11:44:12+03:00" level=info msg="audit trail" action=authorization address="127.0.0.1:58326" connection_id=1 permission=read pid=1 query="select file_path, uast(blob_content) name \nfrom refs natural \njoin commit_files natural \njoin blobs \nwhere LANGUAGE(file_path) = 'Go' \nlimit 10" success=true system=audit user=root
time="2019-08-23T11:44:12+03:00" level=info msg="audit trail" action=query address="127.0.0.1:58326" connection_id=1 duration=48.694585ms pid=1 query="select file_path, uast(blob_content) name \nfrom refs natural \njoin commit_files natural \njoin blobs \nwhere LANGUAGE(file_path) = 'Go' \nlimit 10" success=true system=audit user=root
time="2019-08-23T11:44:12+03:00" level=info msg="ConnectionClosed: client 1"

note test uses stdout+stderr capturer and at first it may be a delusion that capturer does not have time to capture this part of log, but it's not true, because: 1) I have checked this test with different pre-delays(0.5-5 seconds) before capture 2) imo this warning should appear during the query execution and afaik if ConnectionClosed: client 1 has appeared no additional query-related logs will be observed

erizocosmico commented 5 years ago

Warn should be printed if it's a blob parse error: https://github.com/src-d/gitbase/blob/000ef21ab45467f159ce5e2362df1a701790764e/internal/function/uast_utils.go#L124

erizocosmico commented 4 years ago

I've managed to reproduce this, but after much investigation I have no idea why this is happening, since the log is there. If anyone else wants to give this a try, feel free.

agarciamontoro commented 4 years ago

I spent the day yesterday with this and no clue either. Could it be possible that the error comes from some weird race condition when capturing the output, @lwsanty? I'm just throwing ideas here...

lwsanty commented 4 years ago

@agarciamontoro I did a couple additional checks for the main suspect of std capturer 1) I ran tests with race detector flag several times result no races detected and still able to reproduce

2) I removed capturer from the code and still could reproduce the issue

lwsanty commented 4 years ago

Update gitbase option --parallelism=1 heals data races, but bug is still reproduced and is reproduced constantly now


On the other hand, if gitbase is compiled with race detector then test is reproduced with race

[2019-09-27T17:15:01.168485901+03:00]  INFO =====> TestParseErrorWarnings
[2019-09-27T17:15:01.169418541+03:00]  INFO Preparing run
[2019-09-27T17:15:01.169521102+03:00]  INFO Downloading repositories
[2019-09-27T17:15:01.321752243+03:00]  INFO Preparing gitbase binaries
[2019-09-27T17:15:02.153010817+03:00]  INFO Binary for remote:debug-955 (6679fb0e3b54058dbf4d809b7b65067b90ef3104) already built
[2019-09-27T17:15:02.158484857+03:00]  INFO Running query query={ID: Name: Statements:[select file_path, uast(blob_content) name 
from refs natural 
join commit_files natural 
join blobs 
where LANGUAGE(file_path) = 'Go' 
limit 10]} version=remote:debug-955
[2019-09-27T17:15:02.159184868+03:00]  INFO Executing gitbase test
[2019-09-27T17:15:03.293164969+03:00]  INFO runQueryCtx -> defer
[2019-09-27T17:15:04.304478201+03:00]  INFO out: time="2019-09-27T17:15:02+03:00" level=info msg="squash tables rule is enabled"
time="2019-09-27T17:15:02+03:00" level=info msg="server started and listening on localhost:3306"
time="2019-09-27T17:15:03+03:00" level=info msg="NewConnection: client 1"
time="2019-09-27T17:15:03+03:00" level=info msg="audit trail" action=authentication address="127.0.0.1:43160" success=true system=audit user=root
time="2019-09-27T17:15:03+03:00" level=info msg="audit trail" action=authorization address="127.0.0.1:43160" connection_id=1 permission=read pid=1 query="select file_path, uast(blob_content) name \nfrom refs natural \njoin commit_files natural \njoin blobs \nwhere LANGUAGE(file_path) = 'Go' \nlimit 10" success=true system=audit user=root
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> return node, nil"
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> return node, nil"
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> return node, nil"
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> return node, nil"
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> return node, nil"
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> return node, nil"
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> return node, nil"
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> return node, nil"
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> return node, nil"
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> return node, nil"
==================
WARNING: DATA RACE
Write at 0x00c0000d7198 by goroutine 21:
  github.com/src-d/go-mysql-server/sql/plan.(*exchangeRowIter).Close()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/plan/exchange.go:269 +0x105
  github.com/src-d/go-mysql-server/sql/plan.(*limitIter).Close()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/plan/limit.go:77 +0x55
  github.com/src-d/go-mysql-server/sql.(*spanIter).Close()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/session.go:416 +0x71
  github.com/src-d/go-mysql-server/sql/plan.(*trackedRowIter).Close()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/plan/process.go:142 +0x73
  github.com/src-d/go-mysql-server/server.(*Handler).ComQuery()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/server/handler.go:280 +0xf15
  vitess.io/vitess/go/mysql.(*Conn).execQuery()
      /home/lwsanty/goproj/gopath/pkg/mod/vitess.io/vitess@v3.0.0-rc.3.0.20190602171040-12bfde34629c+incompatible/go/mysql/conn.go:819 +0x20b
  vitess.io/vitess/go/mysql.(*Conn).handleNextCommand()
      /home/lwsanty/goproj/gopath/pkg/mod/vitess.io/vitess@v3.0.0-rc.3.0.20190602171040-12bfde34629c+incompatible/go/mysql/conn.go:749 +0x1705
  vitess.io/vitess/go/mysql.(*Listener).handle()
      /home/lwsanty/goproj/gopath/pkg/mod/vitess.io/vitess@v3.0.0-rc.3.0.20190602171040-12bfde34629c+incompatible/go/mysql/server.go:441 +0x13f8

Previous read at 0x00c0000d7198 by goroutine 27:
  github.com/src-d/go-mysql-server/sql/plan.(*exchangeRowIter).iterPartitions()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/plan/exchange.go:182 +0x106

Goroutine 21 (running) created at:
  vitess.io/vitess/go/mysql.(*Listener).Accept()
      /home/lwsanty/goproj/gopath/pkg/mod/vitess.io/vitess@v3.0.0-rc.3.0.20190602171040-12bfde34629c+incompatible/go/mysql/server.go:243 +0x18a
  github.com/src-d/gitbase/cmd/gitbase/command.(*Server).Execute()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/server/server.go:79 +0xe43
  github.com/jessevdk/go-flags.(*Parser).ParseArgs()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/jessevdk/go-flags@v1.4.0/parser.go:316 +0x110a
  main.main()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/jessevdk/go-flags@v1.4.0/parser.go:186 +0x8a6

Goroutine 27 (finished) created at:
  github.com/src-d/go-mysql-server/sql/plan.(*exchangeRowIter).start()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/plan/exchange.go:135 +0xa0
==================
==================
WARNING: DATA RACE
Read at 0x00c000d34d40 by goroutine 21:
  github.com/src-d/go-borges/libraries.(*mergedRepoIter).Close()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-borges@v0.1.3/libraries/iterator.go:57 +0x3e
  github.com/src-d/gitbase.(*repositoryPartitionIter).Close()
      /home/lwsanty/goproj/lwsanty/gitbase/partition.go:96 +0x7b
  github.com/src-d/go-mysql-server/sql/plan.(*exchangeRowIter).Close()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/plan/exchange.go:273 +0xcb
  github.com/src-d/go-mysql-server/sql/plan.(*limitIter).Close()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/plan/limit.go:77 +0x55
  github.com/src-d/go-mysql-server/sql.(*spanIter).Close()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/session.go:416 +0x71
  github.com/src-d/go-mysql-server/sql/plan.(*trackedRowIter).Close()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/plan/process.go:142 +0x73
  github.com/src-d/go-mysql-server/server.(*Handler).ComQuery()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/server/handler.go:280 +0xf15
  vitess.io/vitess/go/mysql.(*Conn).execQuery()
      /home/lwsanty/goproj/gopath/pkg/mod/vitess.io/vitess@v3.0.0-rc.3.0.20190602171040-12bfde34629c+incompatible/go/mysql/conn.go:819 +0x20b
  vitess.io/vitess/go/mysql.(*Conn).handleNextCommand()
      /home/lwsanty/goproj/gopath/pkg/mod/vitess.io/vitess@v3.0.0-rc.3.0.20190602171040-12bfde34629c+incompatible/go/mysql/conn.go:749 +0x1705
  vitess.io/vitess/go/mysql.(*Listener).handle()
      /home/lwsanty/goproj/gopath/pkg/mod/vitess.io/vitess@v3.0.0-rc.3.0.20190602171040-12bfde34629c+incompatible/go/mysql/server.go:441 +0x13f8

Previous write at 0x00c000d34d40 by goroutine 27:
  github.com/src-d/go-borges/libraries.(*mergedRepoIter).Next()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-borges@v0.1.3/libraries/iterator.go:37 +0x1ae
  github.com/src-d/gitbase.(*repositoryPartitionIter).Next()
      /home/lwsanty/goproj/lwsanty/gitbase/partition.go:79 +0x61
  github.com/src-d/go-mysql-server/sql/plan.(*exchangeRowIter).iterPartitions()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/plan/exchange.go:187 +0x26f

Goroutine 21 (running) created at:
  vitess.io/vitess/go/mysql.(*Listener).Accept()
      /home/lwsanty/goproj/gopath/pkg/mod/vitess.io/vitess@v3.0.0-rc.3.0.20190602171040-12bfde34629c+incompatible/go/mysql/server.go:243 +0x18a
  github.com/src-d/gitbase/cmd/gitbase/command.(*Server).Execute()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/server/server.go:79 +0xe43
  github.com/jessevdk/go-flags.(*Parser).ParseArgs()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/jessevdk/go-flags@v1.4.0/parser.go:316 +0x110a
  main.main()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/jessevdk/go-flags@v1.4.0/parser.go:186 +0x8a6

Goroutine 27 (finished) created at:
  github.com/src-d/go-mysql-server/sql/plan.(*exchangeRowIter).start()
      /home/lwsanty/goproj/gopath/pkg/mod/github.com/src-d/go-mysql-server@v0.4.1-0.20190821121850-0e0249cf7bc0/sql/plan/exchange.go:135 +0xa0
==================
time="2019-09-27T17:15:03+03:00" level=info msg="audit trail" action=query address="127.0.0.1:43160" connection_id=1 duration=128.906846ms pid=1 query="select file_path, uast(blob_content) name \nfrom refs natural \njoin commit_files natural \njoin blobs \nwhere LANGUAGE(file_path) = 'Go' \nlimit 10" success=true system=audit user=root
time="2019-09-27T17:15:03+03:00" level=info msg="getUASTFromBblfsh -> parsing..."
time="2019-09-27T17:15:03+03:00" level=info msg="ConnectionClosed: client 1"

[2019-09-27T17:15:04.304984522+03:00]  INFO error occurred: repeats of "level=warning": exp: 1 act: 0
[2019-09-27T17:15:04.305168782+03:00]  INFO =====> done
[2019-09-27T17:15:04.305304968+03:00]  INFO 1 tests have failed:
[TestParseErrorWarnings]
exit status 1
lwsanty commented 4 years ago

have one more case, certainly related to this one, but having a slightly different behavior

the other test sub-case that returns mocked response

&v2.ParseResponse{Language: "go", Uast: bufData, Errors: []*v2.ParseError{{"only one parse error"}}},

is still dynamically failing in the same way to the previous test, however if gitbase runs with parallelism=1 flag this test is PASSed in 100 runs out of 100, contrasting to the previous case that fails in 100% of cases when this flag is set

lwsanty commented 4 years ago

couple of updates here 1) https://github.com/src-d/gitbase/issues/955#issuecomment-541022646 is completely fixed with the new version of go-mysql-server 2) I had a talk with @dennwc and figured out that mocked response from https://github.com/src-d/gitbase/issues/955#issue-484425450 is valid one so no errors should be returned, thus this test works as expected during parallelism=1 and fails with context canceled with default parameter.

I suspect that for concurrent case some of the requests cancels the context that forces other requests to fail

lwsanty commented 4 years ago

OK, I think I can minimize and change the bug's description if parallelism != 1 and some query performed, for example

select file_path, uast(blob_content) name 
from refs natural 
join commit_files natural 
join blobs 
where LANGUAGE(file_path) = 'Go' 
limit 10

it causes context cancel, but it should not

lwsanty commented 4 years ago

I have done a small investigation according to my previous comment and wanted to mention on more version here.

IMO the problem of canceled context lies in the specification of go-mysql-server in parallelism mode with the queries that have a limit. Let's consider the cases:

lwsanty commented 4 years ago

As discussed with @erizocosmico exchange nodes do not know parent nodes which limit nodes + they work unsynced. At the same time this extra exchange routine does not affect anything harmfully as it fails with context cancel with time. Fixing it is probably time consuming.

Taking everything into account we agreed to keep this behavior as a feature. From my side I will update the test suite to ignore context canceled error.