dgraph-io / dgraph

The high-performance database for modern applications
https://dgraph.io
Other
20.43k stars 1.5k forks source link

[BUG]: Unclosed regular expression query causes dgraph crash #8732

Closed jx6f closed 1 year ago

jx6f commented 1 year ago

What version of Dgraph are you using?

v22.0.2(docker):

Dgraph version   : v22.0.2
Dgraph codename  : dgraph
Dgraph SHA-256   : a11258bf3352eff0521bc68983a5aedb9316414947719920d75f12143dd368bd
Commit SHA-1     : 55697a4
Commit timestamp : 2022-12-16 23:03:35 +0000
Branch           : release/v22.0.2
Go version       : go1.18.5
jemalloc enabled : true

I have confirmed this on the latest master branch as well:

Dgraph version   : v23.0.0-beta1-7-g2fc9a99fa
Dgraph codename  : dgraph
Dgraph SHA-256   : d6ad9375ea69f7188c76039c7004b9592d11cb11fb2095d45675c7319a5b26eb
Commit SHA-1     : 2fc9a99fa
Commit timestamp : 2023-03-05 05:41:58 +0530
Branch           : main
Go version       : go1.19.4
jemalloc enabled : true

Tell us a little more about your go-environment?

go version go1.20 linux/amd64

Have you tried reproducing the issue with the latest release?

Yes

What is the hardware spec (RAM, CPU, OS)?

What steps will reproduce the bug?

$ git clone --depth 1 https://github.com/dgraph-io/dgo.git -b v2.1.0                                                                                                                                  
$ grpcurl -plaintext -import-path dgo/protos -proto api.proto \
-d '{"query":"query q($name:string){ q(func: regexp(dgraph.type,$name)) {}}", "vars": {"$name": "/"}}' \
dgraph:9080 api.Dgraph/Query
ERROR:
  Code: Unavailable
  Message: error reading from server: EOF

Expected behavior and actual result.

The expected behavior is to return a parse error. The actual result is crash of dgraph alpha server.

Additional information

logs during the crash:

panic: runtime error: slice bounds out of range [1:0]

goroutine 260 [running]:
github.com/dgraph-io/dgraph/dql.parseRegexArgs({0x2df7658, 0x1})
    /tmp/dgraph/dql/parser.go:1640 +0x165
github.com/dgraph-io/dgraph/dql.regExpVariableFilter(0xc0001a5e00, 0x0)
    /tmp/dgraph/dql/parser.go:457 +0x4d
github.com/dgraph-io/dgraph/dql.substituteVariables(0xc00037fa40, 0x0?)
    /tmp/dgraph/dql/parser.go:398 +0x8ec
github.com/dgraph-io/dgraph/dql.ParseWithNeedVars({{0xc0019b1380?, 0xc0019a6840?}, 0xc001998f00?}, {0x0, 0x0, 0x0})
    /tmp/dgraph/dql/parser.go:632 +0xf0d
github.com/dgraph-io/dgraph/edgraph.parseRequest(0xc0019a6840)
    /tmp/dgraph/edgraph/server.go:1566 +0x40d
github.com/dgraph-io/dgraph/edgraph.(*Server).doQuery(0x22fde28?, {0x22fde28, 0xc001999350}, 0xc0019d9878)
    /tmp/dgraph/edgraph/server.go:1314 +0x992
github.com/dgraph-io/dgraph/edgraph.(*Server).QueryNoGrpc(0x22f6c90?, {0x22fde28?, 0xc001998a20?}, 0xc000e6fb60)
    /tmp/dgraph/edgraph/server.go:1216 +0x2f9
github.com/dgraph-io/dgraph/edgraph.(*Server).Query(0x20?, {0x22fde28, 0xc001998a20}, 0x0?)
    /tmp/dgraph/edgraph/server.go:1182 +0x3b
github.com/dgraph-io/dgo/v210/protos/api._Dgraph_Query_Handler.func1({0x22fde28, 0xc001998a20}, {0x1f5ab60?, 0xc000e6fb60})
    /go/pkg/mod/github.com/dgraph-io/dgo/v210@v210.0.0-20210407152819-261d1c2a6987/protos/api/api.pb.go:1693 +0x78
github.com/dgraph-io/dgraph/ee/audit.AuditRequestGRPC({0x22fde28?, 0xc001998a20?}, {0x1f5ab60?, 0xc000e6fb60?}, 0xc0002dea20?, 0x1dcc180?)
    /tmp/dgraph/ee/audit/interceptor_ee.go:79 +0xe7
github.com/dgraph-io/dgo/v210/protos/api._Dgraph_Query_Handler({0x1f23720?, 0x31fd358}, {0x22fde28, 0xc001998a20}, 0xc0019b4e00, 0x206f250)
    /go/pkg/mod/github.com/dgraph-io/dgo/v210@v210.0.0-20210407152819-261d1c2a6987/protos/api/api.pb.go:1695 +0x138
google.golang.org/grpc.(*Server).processUnaryRPC(0xc000114f00, {0x2306c40, 0xc001ac1a00}, 0xc0010f7d40, 0xc000159080, 0x2dae2b8, 0x0)
    /go/pkg/mod/google.golang.org/grpc@v1.52.0/server.go:1336 +0xd23
google.golang.org/grpc.(*Server).handleStream(0xc000114f00, {0x2306c40, 0xc001ac1a00}, 0xc0010f7d40, 0x0)
    /go/pkg/mod/google.golang.org/grpc@v1.52.0/server.go:1704 +0xa2f
google.golang.org/grpc.(*Server).serveStreams.func1.2()
    /go/pkg/mod/google.golang.org/grpc@v1.52.0/server.go:965 +0x98
created by google.golang.org/grpc.(*Server).serveStreams.func1
    /go/pkg/mod/google.golang.org/grpc@v1.52.0/server.go:963 +0x28a
[Sentry] 2023/03/06 11:37:31 Sending fatal event [2d84c90d17b34d66b33e41d70a109a48] to o318308.ingest.sentry.io project: 5208688
[Sentry] 2023/03/06 11:37:31 Buffer flushed successfully.
MichelDiz commented 1 year ago

@jx6f that panic killed your cluster? if not, it is not a big issue. But we have to find a way to give a better error for an invalid query.

I tried in my end and the panic happens, but I had no crash. The error is normal. Also, the parser is checking for double / so that is why it panics.

jx6f commented 1 year ago

@MichelDiz Thank you for your responce.

Yes, that panic killed my cluster. To reproduce this problem, gRPC is required. If you tried with JSON, the query will only cause a panic.

Here is another reproduce example, this may easier to confirm.

package main

import (
    "context"
    "log"

    "github.com/dgraph-io/dgo/v210"
    "github.com/dgraph-io/dgo/v210/protos/api"
    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials/insecure"
)

var dgraph = "127.0.0.1:9080"

func main() {
    d, err := grpc.Dial(dgraph, grpc.WithTransportCredentials(insecure.NewCredentials()))
    if err != nil {
        log.Fatal(err)
    }

    dc := dgo.NewDgraphClient(api.NewDgraphClient(d))
    query := `
query q($name:string){
    q(func: regexp(dgraph.type,$name)) {
}}`
    vars := map[string]string{"$name": "/"}
    resp, err := dc.NewReadOnlyTxn().QueryWithVars(context.Background(), query, vars)
    if err != nil {
        log.Fatal(err)
    }

    log.Printf("%s\n", string(resp.Json))
}
MichelDiz commented 1 year ago

@jx6f thank you for the code. Yep, via gRPC it crashes the Alpha.

@mangalaman93, not sure if your change will trap that. I have tested via HTTP, not gRPC. But we have a complete crash via gRPC. Not sure why it goes direct to full crash instead of doing just a panic.

The error from the client is

rpc error: code = Unavailable desc = error reading from server: EOF

End of file error oO

update:

this

if len(val) < 2 || end == 0 {

Will solve the issue, but no idea about the diferences in the panic in HTTP and crash in gRPC. This is odd.

The gRPC panic happens in expr := strings.Replace(val[1:end], "\\/", "/", -1)

mangalaman93 commented 1 year ago

I think this PR will solve problems for both HTTP and gRPC

jx6f commented 1 year ago

The Go http package does not seem to crash during panic because it calls recover.

https://github.com/golang/go/blob/71c84d4b4149bebc2abcc495ef744e1a010a18e7/src/net/http/server.go#L1860

mangalaman93 commented 1 year ago

I don't think it even panics in HTTP case. Do you have an example of the panic?

MichelDiz commented 1 year ago

It panics in HTTP. My first atempt to reproduce was using Main branch and Ratel. So it panics, but there's no crash. That's why I thought it wasn't a big issue. Once I tested the jx6f code using gRPC I saw the panic with crash. Which I found very strange, as in one case it crashes and in the other it doesn't.

mangalaman93 commented 1 year ago

Could you help me reproduce it @MichelDiz?

MichelDiz commented 1 year ago

I did nothing especial.

1 - Run Dgraph in debug mode. 2 - Execute the mentioned query via Ratel. And that's it.

But maybe it is nothing, if your checking code trap all those, it is fine.

jx6f commented 1 year ago

To recover from panic when using gRPC, grpc_recovery seems to be useful.

https://github.com/grpc-ecosystem/go-grpc-middleware#server

In the following example I have confirmed that an error is returned instead of a panic(crash):

diff --git a/dgraph/cmd/alpha/run.go b/dgraph/cmd/alpha/run.go
index 9136fe3..aa7bf5f 100644
--- a/dgraph/cmd/alpha/run.go
+++ b/dgraph/cmd/alpha/run.go
@@ -43,10 +43,12 @@ import (
    "go.opencensus.io/zpages"
    "golang.org/x/net/trace"
    "google.golang.org/grpc"
+   "google.golang.org/grpc/codes"
    "google.golang.org/grpc/credentials"
    _ "google.golang.org/grpc/encoding/gzip" // grpc compression
    "google.golang.org/grpc/health"
    hapi "google.golang.org/grpc/health/grpc_health_v1"
+   "google.golang.org/grpc/status"

    "github.com/dgraph-io/badger/v4"
    "github.com/dgraph-io/dgo/v210/protos/api"
@@ -62,6 +64,8 @@ import (
    "github.com/dgraph-io/dgraph/x"
    _ "github.com/dgraph-io/gqlparser/v2/validator/rules" // make gql validator init() all rules
    "github.com/dgraph-io/ristretto/z"
+   grpc_middleware "github.com/grpc-ecosystem/go-grpc-middleware"
+   grpc_recovery "github.com/grpc-ecosystem/go-grpc-middleware/recovery"
 )

 var (
@@ -446,7 +450,16 @@ func serveGRPC(l net.Listener, tlsCfg *tls.Config, closer *z.Closer) {
        grpc.MaxSendMsgSize(x.GrpcMaxSize),
        grpc.MaxConcurrentStreams(1000),
        grpc.StatsHandler(&ocgrpc.ServerHandler{}),
-       grpc.UnaryInterceptor(audit.AuditRequestGRPC),
+       grpc.UnaryInterceptor(
+           grpc_middleware.ChainUnaryServer(
+               audit.AuditRequestGRPC,
+               grpc_recovery.UnaryServerInterceptor(
+                   grpc_recovery.WithRecoveryHandler(func(p interface{}) error {
+                       glog.Infof("gRPC panic recovered: %v\n", p)
+                       return status.Errorf(codes.Unknown, "panic triggered: %v", p)
+                   }),
+               ),
+           )),
    }
    if tlsCfg != nil {
        opt = append(opt, grpc.Creds(credentials.NewTLS(tlsCfg)))
diff --git a/go.mod b/go.mod
index 2d5d19f..69ab712 100644
--- a/go.mod
+++ b/go.mod
@@ -36,6 +36,7 @@ require (
    github.com/google/uuid v1.3.0
    github.com/gorilla/websocket v1.4.2
    github.com/graph-gophers/graphql-go v1.3.0
+   github.com/grpc-ecosystem/go-grpc-middleware v1.0.0
    github.com/hashicorp/vault/api v1.0.4
    github.com/minio/minio-go/v6 v6.0.55
    github.com/mitchellh/panicwrap v1.0.0
diff --git a/go.sum b/go.sum
index 9f179cb..7fd1275 100644
--- a/go.sum
+++ b/go.sum
@@ -342,6 +342,7 @@ github.com/graph-gophers/graphql-go v1.3.0 h1:Eb9x/q6MFpCLz7jBCiP/WTxjSDrYLR1QY4
 github.com/graph-gophers/graphql-go v1.3.0/go.mod h1:9CQHMSxwO4MprSdzoIEobiHpoLtHm77vfxsvsIN5Vuc=
 github.com/graph-gophers/graphql-transport-ws v0.0.2 h1:DbmSkbIGzj8SvHei6n8Mh9eLQin8PtA8xY9eCzjRpvo=
 github.com/graph-gophers/graphql-transport-ws v0.0.2/go.mod h1:5BVKvFzOd2BalVIBFfnfmHjpJi/MZ5rOj8G55mXvZ8g=
+github.com/grpc-ecosystem/go-grpc-middleware v1.0.0 h1:Iju5GlWwrvL6UBg4zJJt3btmonfrMlCDdsejg4CZE7c=
 github.com/grpc-ecosystem/go-grpc-middleware v1.0.0/go.mod h1:FiyG127CGDf3tlThmgyCl78X/SZQqEOJBCDaAfeWzPs=
 github.com/grpc-ecosystem/go-grpc-prometheus v1.2.0/go.mod h1:8NvIoxWQoOIhqOTXgfV/d3M/q6VIi02HzZEHgUlZvzk=
 github.com/grpc-ecosystem/grpc-gateway v1.4.1/go.mod h1:RSKVYQBd5MCa4OVpNdGskqpgL2+G+NZTnrVHpWWfpdw=