Open nizar-m opened 4 years ago
It would be great if we could compare performance against master for queries that don't involve remote joins. What would be the best way to accomplish that? Can this PR work against master?
Ah, I also noticed that on my machine this creates directories that have weird ownership:
total 16552
-rw-r--r-- 1 me me 2797 Oct 10 12:15 hge.log
-rw-r--r-- 1 me me 3181 Oct 10 12:15 remote_hge.log
drwx------ 19 70 root 4096 Oct 10 12:15 remote_sportsdb_data
-rw-r--r-- 1 me me 675840 Oct 10 11:32 sportsdb_cache.sqlite
drwx------ 19 70 root 4096 Oct 10 12:15 sportsdb_data
-rw-r--r-- 1 me me 15597968 Oct 10 11:32 sportsdb_sample_postgresql_20080304.sql
-rw-r--r-- 1 me me 652509 Oct 10 11:32 sportsdb.zip
This causes problems for tooling like hasktags or ack, since they can't be traversed.
When I run the final benchmarking step I get this:
====================
benchmark: events_remote_affilications
--------------------
candidate: events_remote_affiliations on hge-with-remote at http://127.0.0.1:8081/v1/graphql
Warmup:
++++++++++++++++++++
20Req/s Duration:60s open connections:20
unable to connect to 127.0.0.1:8081 Connection refused
++++++++++++++++++++
40Req/s Duration:60s open connections:20
unable to connect to 127.0.0.1:8081 Connection refused
Benchmark:
++++++++++++++++++++
20Req/s Duration:300s open connections:20
unable to connect to 127.0.0.1:8081 Connection refused
++++++++++++++++++++
40Req/s Duration:300s open connections:20
unable to connect to 127.0.0.1:8081 Connection refused
 * Serving Flask app "bench" (lazy loading)
 * Environment: production
   WARNING: Do not use the development server in a production environment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://0.0.0.0:8050/ (Press CTRL+C to quit)
And visiting http://0.0.0.0:8050/ shows an empty graph.
The connection from a Docker container to an application on localhost will work only when the container is running with --net=host. I have changed the readme to reflect this. (This may not work on Mac, though.)
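A pre-flight check along these lines could catch the "Connection refused" case before the benchmark spends minutes on warmup. This is only a sketch: the function name port_is_open is hypothetical, and the host and port in the usage note are simply the ones from the log output above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

Calling port_is_open("127.0.0.1", 8081) before starting the run and stopping early when it returns False would turn the empty graph into an immediate, readable error.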
You can now specify the directory where all these files should be placed. If you use the same work directory a second time, bringing up the test setup will be much faster: instead of doing the full setup, we simply reuse the Postgres data directories sportsdb_data and remote_sportsdb_data (these directories are bind mounted into the Postgres containers).
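The reuse decision described here might look roughly like the following. This is a sketch, not the actual code in the setup script: the function name needs_full_setup is hypothetical, and only the two directory names come from this thread.

```python
from pathlib import Path

# Postgres data directories named in this thread; both are bind mounted
# into the Postgres containers.
DATA_DIRS = ("sportsdb_data", "remote_sportsdb_data")


def needs_full_setup(work_dir: str) -> bool:
    """Return True unless both Postgres data directories already exist."""
    root = Path(work_dir)
    return not all((root / name).is_dir() for name in DATA_DIRS)
```

The check only looks at whether the directories exist, so it still works when they are owned by a different user, as in the listing above.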
Probably we can run the master using the corresponding docker image, and then run benchmarks.
So I guess we need to make the following comparisons:
Master vs remote relationship branch for the normal queries
Query with table object/array relationship vs with remote object/array relationship
Do we have other comparisons to make?
That's an okay workaround. It would be nicer if they lived in the tree but not owned by root. But it looks like this is a pain to do; not sure:
https://gist.github.com/nitrobin/4d16fbe347c150a422ad https://github.com/moby/moby/issues/2259
I was able to try again and everything went smoothly I think, following your instructions!
Random thoughts, mostly for when I get a chance to integrate this with my dev.sh script, and mostly concerning improvements to https://github.com/hasura/graphql-bench
rps setting...
stack build (since we may be building with profiling, etc.)
candidates: it looks like some get dropped from the graph at higher RPS, though I don't see obvious errors reported
Would you mind adding a link to the included queries.graphql in the readme, and mentioning how one selects the queries with query: in the YAML? It's pretty self-explanatory once you know what files to look at, but it would be helpful. Maybe this belongs in the docs for https://github.com/hasura/graphql-bench instead.
Also, it would be really awesome if you could include a bunch of interesting queries in queries.graphql with comments, e.g. it would be nice to have some that:
Obviously we can all iterate on this and contribute queries as we go, and this will be an ongoing project
Also mention in the README something like: "The graphql-engine log file is located in ./server/tests-py/remote_relationship_tests/test_output/hge.log and persists between runs"
Another thing we should improve: abort if the stack build fails (else we will continue on with the wrong version)
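One way to get that behaviour, assuming the build step is driven from Python: run the command with check=True so that a non-zero exit status raises, then exit with the same status. The helper name run_or_abort is hypothetical, and the command is whatever the caller passes in.

```python
import subprocess
import sys
from typing import Sequence


def run_or_abort(cmd: Sequence[str]) -> None:
    """Run cmd and exit with its status code if it fails."""
    try:
        subprocess.run(list(cmd), check=True)
    except subprocess.CalledProcessError as err:
        print(f"command failed: {' '.join(cmd)}", file=sys.stderr)
        # Stop here so later steps never run against a stale binary.
        sys.exit(err.returncode)
```

With this in place, something like run_or_abort(["stack", "build"]) would stop the script on a failed build instead of benchmarking the previous binary.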
Description
The Python script test_with_sportsdb.py should set up the databases and GraphQL engines required for the tests.
Affected components
Related Issues
Solution and Design
Steps to test and verify
Limitations, known bugs & workarounds