Closed vmarkovtsev closed 7 years ago
Actually, we seem to reproduce it even with a single process. If I add time.sleep()
between requests, the responses become valid and 100% identical.
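For reference, the workaround described above can be sketched like this. The `parse` callable is a stand-in for whatever issues the request to the server (it is an assumption for illustration, not the actual client-python API):

```python
import time

def parse_with_delay(parse, files, delay=1.0):
    """Call `parse` on each file, sleeping between requests.

    `parse` is a placeholder for the real client call; the sleep between
    requests is the workaround that made the responses come back valid
    and identical in the report above.
    """
    results = []
    for path in files:
        results.append(parse(path))
        time.sleep(delay)  # workaround: pause between requests to the server
    return results
```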
Another observation: we did not test it with a single instance of the driver. We will test it now.
This is the log we get. I see errors with empty __init__.py files, though I'm not sure they are relevant. Besides, the environment variable Vadim added in https://github.com/bblfsh/server/pull/25 seems to have no effect.
@juanjux BTW, those errors on empty files do not look gracefully handled :)
Does it only happen with all these files or still happens with a smaller number of files? If the latter, could you provide an example with the minimum number of files and those files to try to reproduce the problem?
This may or may not be related to https://github.com/bblfsh/java-driver/issues/19. EDIT: it's not.
We will try to reduce it now.
We tortured it to death!
status: FATAL
errors: "timedout: all drivers are busy"
status: FATAL
errors: "timedout: all drivers are busy"
This is the minimal example which reproduces the problem in 50% of the cases:
python3 -m bblfsh ast2vec/src/bblfsh/bblfsh/github/com/bblfsh/sdk/__init__.py ast2vec/src/bblfsh/bblfsh/github/com/bblfsh/sdk/protocol/__init__.py ast2vec/src/bblfsh/bblfsh/github/com/bblfsh/sdk/protocol/generated_pb2.py ast2vec/src/bblfsh/bblfsh/github/com/bblfsh/sdk/protocol/generated_pb2_grpc.py ast2vec/src/bblfsh/bblfsh/github/com/bblfsh/sdk/uast/__init__.py ast2vec/src/bblfsh/bblfsh/github/com/bblfsh/sdk/uast/generated_pb2.py > 1.txt &
python3 -m bblfsh ast2vec/src/bblfsh/bblfsh/github/com/bblfsh/sdk/__init__.py ast2vec/src/bblfsh/bblfsh/github/com/bblfsh/sdk/protocol/__init__.py ast2vec/src/bblfsh/bblfsh/github/com/bblfsh/sdk/protocol/generated_pb2.py ast2vec/src/bblfsh/bblfsh/github/com/bblfsh/sdk/protocol/generated_pb2_grpc.py ast2vec/src/bblfsh/bblfsh/github/com/bblfsh/sdk/uast/__init__.py ast2vec/src/bblfsh/bblfsh/github/com/bblfsh/sdk/uast/generated_pb2.py > 2.txt &
python3 -m bblfsh ast2vec/src/bblfsh/bblfsh/github/com/bblfsh/sdk/__init__.py ast2vec/src/bblfsh/bblfsh/github/com/bblfsh/sdk/protocol/__init__.py ast2vec/src/bblfsh/bblfsh/github/com/bblfsh/sdk/protocol/generated_pb2.py ast2vec/src/bblfsh/bblfsh/github/com/bblfsh/sdk/protocol/generated_pb2_grpc.py ast2vec/src/bblfsh/bblfsh/github/com/bblfsh/sdk/uast/__init__.py ast2vec/src/bblfsh/bblfsh/github/com/bblfsh/sdk/uast/generated_pb2.py > 3.txt &
I'll try to reproduce and fix this tomorrow as I return from CurryOn.
Sorry for the delay.
Not the same problem (but it could be related): I'm seeing that parsing those __init__.py files directly from the driver (without the server) produces a fatal error, "missing content" (since the files are empty). This should be a normal error with an empty AST returned, not a fatal one, since fatal errors stop any further processing by that driver and an empty file isn't really unrecoverable. I'll start by fixing that and check whether it solves the problem.
This could also be related to the blocking problem: we suspect the server is not closing failing drivers correctly, so the limit of live images is reached. But that will come later.
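The suspected leak can be illustrated with a toy pool (the names and structure here are mine for illustration, not the actual server code): if a slot is not released when a driver fails, the pool eventually runs dry and every request fails with "timedout: all drivers are busy".

```python
class DriverPool:
    """Toy model of a bounded driver pool (illustrative only)."""

    def __init__(self, limit):
        self.limit = limit
        self.in_use = 0

    def acquire(self):
        if self.in_use >= self.limit:
            raise RuntimeError("timedout: all drivers are busy")
        self.in_use += 1

    def release(self):
        self.in_use -= 1

    def request(self, work):
        self.acquire()
        try:
            return work()
        finally:
            # The suspected bug: if this release is skipped when `work`
            # fails, the slot leaks and the pool eventually runs dry.
            self.release()
```

With the `finally` in place, failed requests still free their slot; without it, a few failing parses would exhaust the pool.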
Update: this bug seems to be a several-headed beast. I'll write and keep updated a list of the problems found so you can see the status from this message. Maybe I'll later split this into several different issues.
Since I can't reproduce this problem anymore after the first fixes, I'll close this one. I'll publish new Docker images of the server and python-driver soon, but meanwhile you can test with a server and python-driver built from master. Build the python driver first:
python-driver$ make build
then tag the created image (with docker tag) as bblfsh/python-driver:latest. Then build and run the server with:
server$ cd cmd/bblfsh
bblfsh$ go build
bblfsh$ sudo rm -rf /tmp/bblfsh-runtime
bblfsh$ sudo ./bblfsh server --transport=docker-daemon
Then you can run the python-client (remember to use the --disable-bblfsh-autorun option to avoid spawning a new server).
BTW, the last version of the python-client seems to work a little differently than explained in the report: it now has a -f parameter for specifying the file, but if you pass it several times it only takes the last file (bug?). So to reproduce this problem with that version I made a script with this content:
#!/bin/sh
python3 -m bblfsh --disable-bblfsh-autorun -l python -f sdk/__init__.py
python3 -m bblfsh --disable-bblfsh-autorun -l python -f sdk/protocol/__init__.py
python3 -m bblfsh --disable-bblfsh-autorun -l python -f sdk/protocol/generated_pb2.py
python3 -m bblfsh --disable-bblfsh-autorun -l python -f sdk/protocol/generated_pb2_grpc.py
python3 -m bblfsh --disable-bblfsh-autorun -l python -f sdk/uast/__init__.py
python3 -m bblfsh --disable-bblfsh-autorun -l python -f sdk/uast/generated_pb2.py
and then called it from this one:
#!/bin/sh
sh bugtest.sh > 1.txt&
sh bugtest.sh > 2.txt&
sh bugtest.sh > 3.txt&
sh bugtest.sh > 4.txt&
sh bugtest.sh > 5.txt&
@juanjux from the first message
Please note that you need the patch to client-python: https://github.com/vmarkovtsev/bblfsh.client-python/tree/main-change
@juanjux The bug is not reproduced the way your script works. We knew that; that's why we created the patch. Please reopen this and let us test your fixes instead of opening a second issue about the same problems.
Reopened. I saw your comment and downloaded from that repo but failed to notice it was a specific branch, sorry about that.
Anyway, I'm testing now with the right branch and the same script (plus the --disable-bblfsh-autorun parameter to use my local updated server) and I'm always getting the same result. Right now I'm writing a script to run it some hundreds of times before I go to lunch, to be 100% safe.
Sorry for closing the bug report before your test. It's pretty common on GitHub to close as soon as the developer considers the bug fixed and reopen if the user says it's not, but I certainly didn't see anything about that in our git workflow document, so I may have acted too fast, dumbed down by the happy feeling of closing this f*cking bug that I've been battling for a day and a half :grinning:
This will be running in the background for a while (run it with full bash, not sh). For the moment it hasn't detected any changes over 60 iterations. I'm also helping Egor test the fix via Slack.
#!/bin/bash
for i in `seq 1 500`; do
    python3 -m bblfsh --disable-bblfsh-autorun bblfsh/github/com/bblfsh/sdk/__init__.py bblfsh/github/com/bblfsh/sdk/protocol/__init__.py bblfsh/github/com/bblfsh/sdk/protocol/generated_pb2.py bblfsh/github/com/bblfsh/sdk/protocol/generated_pb2_grpc.py bblfsh/github/com/bblfsh/sdk/uast/__init__.py bblfsh/github/com/bblfsh/sdk/uast/generated_pb2.py > 1.txt &
    python3 -m bblfsh --disable-bblfsh-autorun bblfsh/github/com/bblfsh/sdk/__init__.py bblfsh/github/com/bblfsh/sdk/protocol/__init__.py bblfsh/github/com/bblfsh/sdk/protocol/generated_pb2.py bblfsh/github/com/bblfsh/sdk/protocol/generated_pb2_grpc.py bblfsh/github/com/bblfsh/sdk/uast/__init__.py bblfsh/github/com/bblfsh/sdk/uast/generated_pb2.py > 2.txt &
    python3 -m bblfsh --disable-bblfsh-autorun bblfsh/github/com/bblfsh/sdk/__init__.py bblfsh/github/com/bblfsh/sdk/protocol/__init__.py bblfsh/github/com/bblfsh/sdk/protocol/generated_pb2.py bblfsh/github/com/bblfsh/sdk/protocol/generated_pb2_grpc.py bblfsh/github/com/bblfsh/sdk/uast/__init__.py bblfsh/github/com/bblfsh/sdk/uast/generated_pb2.py > 3.txt &
    sleep 5
    # diff exits with 0 if the files are identical, 1 if they differ
    diff 1.txt 2.txt > /dev/null
    D1=$?
    diff 2.txt 3.txt > /dev/null
    D2=$?
    if [ "$D1" -eq 1 ] || [ "$D2" -eq 1 ]; then
        echo "Different files"
        exit 1
    fi
    rm 1.txt 2.txt 3.txt
    echo "Finished iteration $i"
done
Worked 500 times. I'll make a Docker image so Egor can test this without much hassle.
Thanks! Tomorrow ML has OSF, so Egor will be able to test on Monday.
Ok Vadim, as I discussed with Egor, I'll prepare a Dockerfile that downloads and compiles all components from master, so this and other bugfixes are easier for you both to test before we release new versions of the Docker images on Docker Hub.
I created a project with a Dockerfile and a test script here: https://github.com/juanjux/bblfsh-dev-image
For testing:
git clone https://github.com/juanjux/bblfsh-dev-image.git
cd bblfsh-dev-image
docker build -t bblfshdev .
# wait for the image to build...
docker run -v $(pwd):/script --privileged -it --rm bblfshdev bash -l /script/test_server_issue34.sh
# wait for the 500 iterations (* 5 seconds) of the test to finish...
# ...make whatever other tests you need...
Several Bothan agents died to create this Dockerfile + script and make it compatible with Ubuntu 16.04.
Thank you @juanjux!
Update:
Thanks for an awesome effort on a containerizing build for bblfsh!
I must be doing something wrong here, but for some reason the su prebblfsh -c "make build" step fails with error: [Errno 13] Permission denied: '/opt/driver/src/.local', and at this point it's hard to say which container filesystem we are talking about (as this file does not exist in the host container).
I actually found the same problem tonight; it should be fixed in the current version of the docker image project. That command is run inside the Docker of the Python driver (the Docker inside the Docker). The problem was a UID collision, fixed by adding the BUILD_UID env variable hardcoding the UID to the one of the previously created prebblfsh user (1000) and then giving that UID permission over the driver directories (which are mounted at /opt/driver inside the drivers' Dockers).
🚀 Builds like a charm now: both the python and java drivers were built. Thank you for the prompt fixes!
Update, if anybody is interested: I put together the instructions to build the same on Debian Jessie, so you can build the latest bblfsh, e.g. on the free Google Cloud Console:
sudo apt-get install libglib2.0-0=2.42.1-1+b1
sudo apt-get install -y pkg-config libglib2.0-dev libostree-dev
go get github.com/bblfsh/sdk/... && \
go get github.com/bblfsh/python-driver/... && \
go get github.com/bblfsh/java-driver/... && \
go get github.com/bblfsh/server/...
cd ~/gopath/src/github.com/bblfsh/server/cmd/bblfsh
cd ../../../python-driver/
bblfsh-sdk prepare-build && bblfsh-sdk update && make build
cd ../java-driver
bblfsh-sdk prepare-build && bblfsh-sdk update && make build
@bzz yep, most of the complications in the Dockerfile are because of dependencies (and dependencies of dependencies) that are missing in Ubuntu 16.04; on 16.10+ it's much simpler.
Feel free to reopen if it's not fixed.
@EgorBu I don't see your test report here. Is everything fixed for us?
Not everything. During the first iteration it constantly returns the wrong result: 4.0K instead of 3.3M.
I think it may be related to https://github.com/bblfsh/server/issues/37
Testing for 10 iterations with a sleep of 5 seconds
4.0K 1.txt 4.0K 2.txt 4.0K 3.txt 8.0K requirements.txt
Finished iteration 1
3.3M 1.txt 3.3M 2.txt 3.3M 3.txt 8.0K requirements.txt
uast { internal_type: "Module" start_position { line: 1 col: 1 } roles: OP_BOOLEAN_AND } uast {
Finished iteration 2
3.3M 1.txt 3.3M 2.txt 3.3M 3.txt 8.0K requirements.txt
uast { internal_type: "Module" start_position { line: 1 col: 1 } roles: OP_BOOLEAN_AND } uast {
Finished iteration 3
3.3M 1.txt 3.3M 2.txt 3.3M 3.txt 8.0K requirements.txt
uast { internal_type: "Module" start_position { line: 1 col: 1 } roles: OP_BOOLEAN_AND } uast {
Finished iteration 4
3.3M 1.txt 3.3M 2.txt 3.3M 3.txt 8.0K requirements.txt
uast { internal_type: "Module" start_position { line: 1 col: 1 } roles: OP_BOOLEAN_AND } uast {
Finished iteration 5
3.3M 1.txt 3.3M 2.txt 3.3M 3.txt 8.0K requirements.txt
uast { internal_type: "Module" start_position { line: 1 col: 1 } roles: OP_BOOLEAN_AND } uast {
Finished iteration 6
3.3M 1.txt 3.3M 2.txt 3.3M 3.txt 8.0K requirements.txt
uast { internal_type: "Module" start_position { line: 1 col: 1 } roles: OP_BOOLEAN_AND } uast {
Finished iteration 7
3.3M 1.txt 3.3M 2.txt 3.3M 3.txt 8.0K requirements.txt
uast { internal_type: "Module" start_position { line: 1 col: 1 } roles: OP_BOOLEAN_AND } uast {
Finished iteration 8
3.3M 1.txt 3.3M 2.txt 3.3M 3.txt 8.0K requirements.txt
uast { internal_type: "Module" start_position { line: 1 col: 1 } roles: OP_BOOLEAN_AND } uast {
Finished iteration 9
3.3M 1.txt 3.3M 2.txt 3.3M 3.txt 8.0K requirements.txt
uast { internal_type: "Module" start_position { line: 1 col: 1 } roles: OP_BOOLEAN_AND } uast {
Finished iteration 10
@EgorBu as I was telling you in Slack, it's a false positive: the 5-second sleep is not enough for the driver instantiation, so the 3 files are still empty (bash file redirections create the files even before any content is generated on stdout). Changing it to 20 seconds eliminates the problem 100% of the time.
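A more robust check than a fixed sleep is to wait until the output files actually have content, since the shell redirection creates them empty immediately. A minimal sketch (the function name and parameters are mine, not part of any project script):

```python
import os
import time

def wait_for_nonempty(paths, timeout=60.0, poll=0.5):
    """Return True once every path exists and is non-empty, or False
    after `timeout` seconds.

    Avoids the false positive where `> 1.txt` creates an empty file
    before the driver has produced any output on stdout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if all(os.path.exists(p) and os.path.getsize(p) > 0 for p in paths):
            return True
        time.sleep(poll)
    return False
```

In the reproduction loop above, this would replace the `sleep 5` / `sleep 20` guess with a check that only proceeds once the drivers have actually written something.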
We use the following script to reproduce the problem:
Please note that you need the patch to client-python: https://github.com/vmarkovtsev/bblfsh.client-python/tree/main-change
Result: 1.txt, 2.txt and 3.txt are randomly different (checked with diff). Besides, some files are in wrong places or empty.
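The pairwise diff check can also be done from Python with the standard-library filecmp module; a small sketch (the helper name is mine, for illustration):

```python
import filecmp

def all_identical(paths):
    """Return True if every file is byte-identical to the first one.

    shallow=False forces a content comparison rather than relying on
    os.stat metadata, which is what `diff` effectively checks here.
    """
    first = paths[0]
    return all(filecmp.cmp(first, p, shallow=False) for p in paths[1:])
```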