bblfsh / sdk

Babelfish driver SDK
GNU General Public License v3.0

.filter() - Invalid memory address or nil pointer dereference #424

Closed rick2600 closed 5 years ago

rick2600 commented 5 years ago

I accidentally found an invalid memory access.

poc.py

import bblfsh

client = bblfsh.BblfshClient("localhost:9432")
ctx = client.parse("poc.php")
ctx.filter("//*[@role='Variable']*//Name")

poc.php: any PHP code, apparently

stacktrace

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x7f82fa1f781e]

goroutine 17 [running, locked to thread]:
github.com/antchfx/xpath.(*Expr).Evaluate(0xc00000e580, 0x7f82fa590380, 0xc00000e5a0, 0x10, 0x7f82fa54efc0)
    /home/travis/gopath/pkg/mod/github.com/antchfx/xpath@v0.0.0-20180922041825-3de91f3991a1/xpath.go:120 +0x6e
github.com/bblfsh/sdk/v3/uast/query/xpath.(*xQuery).Execute(0xc00004ab80, 0x7f82fa58d5c0, 0xc00006b140, 0x7f82fa589320, 0xc00004ab80, 0x0, 0x0)
    /home/travis/gopath/pkg/mod/github.com/bblfsh/sdk/v3@v3.0.0/uast/query/xpath/xpath.go:45 +0xd4
github.com/bblfsh/sdk/v3/uast/query/xpath.(*index).Execute(0x7f82fa776028, 0x7f82fa58d5c0, 0xc00006b140, 0xc0000146e0, 0x1c, 0x7f82fa070167, 0xc0000146e0, 0x27a3da0, 0x1c)
    /home/travis/gopath/pkg/mod/github.com/bblfsh/sdk/v3@v3.0.0/uast/query/xpath/xpath.go:35 +0x9e
main.(*Context).Filter(0xc000058240, 0x7f82fa58e4c0, 0xc00006b140, 0xc0000146e0, 0x1c, 0xc000046e70, 0x7f82fa056b31, 0xc000000008)
    /home/travis/gopath/src/github.com/bblfsh/libuast/src/uast.go:268 +0xb7
main.UastFilter(0x27a2d70, 0x1, 0x27a3da0, 0x7f82fa584b58)
    /home/travis/gopath/src/github.com/bblfsh/libuast/src/api.go:217 +0xc1
main._cgoexpwrap_d69b59b6893f_UastFilter(0x27a2d70, 0x1, 0x27a3da0, 0x0)
    _cgo_gotypes.go:548 +0x74
Aborted (core dumped)

ncordon commented 5 years ago

Thanks for reporting this, @rick2600. We will address it as soon as possible.

ncordon commented 5 years ago

Hi @rick2600. The query "//*[@role='Variable']*//Name" has a spare asterisk * right after the predicate. I suggest using this query instead:

import bblfsh

client = bblfsh.BblfshClient("localhost:9432")
ctx = client.parse("poc.php")
ctx.filter("//*[@role='Variable']//Name")

We should certainly find a more user-friendly way to report that a query is not syntactically correct :wink: If you want to extract all the variable names used, you can iterate over the result of the filter:

import bblfsh

client = bblfsh.BblfshClient("localhost:9432")
ctx = client.parse("poc.php")
it = ctx.filter("//*[@role='Variable']//Name")

for name in it:
    print(name)
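Until the query engine reports this kind of mistake itself, a client-side guard can catch the specific typo from this issue before the query is sent. The following is only a minimal sketch: `has_stray_asterisk` and `checked_filter` are hypothetical helpers, not part of the bblfsh client, and the regular expression is a heuristic that flags any `*` placed directly after a closing `]`. It is not an XPath parser, and a query that intentionally uses `*` in that position would be flagged as well.

```python
import re

# Hypothetical helpers, NOT part of the bblfsh client API.
# Heuristic: a "*" that directly follows a closing predicate bracket "]",
# as in "//*[@role='Variable']*//Name".
_STRAY_STAR = re.compile(r"\]\s*\*")


def has_stray_asterisk(query):
    """Return True if the query contains '*' right after a ']'."""
    return _STRAY_STAR.search(query) is not None


def checked_filter(ctx, query):
    """Run ctx.filter(query) unless the heuristic flags the query.

    `ctx` is assumed to be the object returned by client.parse(...),
    as in the snippets above.
    """
    if has_stray_asterisk(query):
        raise ValueError(
            "suspicious '*' after a predicate in query: %r" % (query,)
        )
    return ctx.filter(query)
```

With this in place, `checked_filter(ctx, "//*[@role='Variable']*//Name")` raises a `ValueError` in Python instead of passing the query on to the native library.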

ncordon commented 5 years ago

As for the panic itself, we will investigate it further, because the Python interpreter should not die when it executes a malformed query.

Here's a reproducible example (using a file from the AWS SDK for PHP):

curl https://raw.githubusercontent.com/aws/aws-sdk-php/master/src/Handler/GuzzleV5/PsrStream.php -o file.php

import bblfsh

client = bblfsh.BblfshClient("localhost:9432")
ctx = client.parse("./file.php")
ctx.filter("//*[@role='Variable']*//Name")

rick2600 commented 5 years ago

Hi @ncordon, thank you. Yes, the problem is caused by this asterisk. The query is not correct, but I decided to report it because of the SIGSEGV.
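While the crash is unfixed, one possible workaround is to run untrusted queries in a child process, so that a fatal signal in the native library ends only the child. This is a rough sketch under stated assumptions: `run_isolated` and `filter_isolated` are hypothetical names, the child script simply mirrors the bblfsh calls shown in this thread, and the results come back as printed text rather than node objects.

```python
import subprocess
import sys

# Script executed in the child interpreter. The bblfsh calls mirror the
# snippets in this thread; argv carries endpoint, file path and query.
CHILD_SCRIPT = """
import sys
import bblfsh

client = bblfsh.BblfshClient(sys.argv[1])
ctx = client.parse(sys.argv[2])
for node in ctx.filter(sys.argv[3]):
    print(node)
"""


def run_isolated(argv, timeout=60):
    """Run a command in a child process and return (returncode, stdout, stderr).

    On POSIX, subprocess reports a negative returncode when the child
    was terminated by a signal.
    """
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return proc.returncode, proc.stdout, proc.stderr


def filter_isolated(endpoint, path, query):
    """Hypothetical wrapper: run the query in a child, return printed lines."""
    code, out, _err = run_isolated(
        [sys.executable, "-c", CHILD_SCRIPT, endpoint, path, query]
    )
    if code != 0:
        # Covers both ordinary errors and death by signal.
        raise RuntimeError("query process exited with status %d" % code)
    return out.splitlines()
```

For example, `filter_isolated("localhost:9432", "./file.php", "//*[@role='Variable']*//Name")` would be expected to raise a `RuntimeError` in the parent instead of aborting it, at the cost of starting a new interpreter for each query.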

ncordon commented 5 years ago

I am transferring this issue to the sdk repository.