Open krish7919 opened 2 years ago
I see that @dosco mentions in his talk at https://www.youtube.com/watch?v=gzAiAbsCMVA&t=1396s that this runs on a serverless platform. Is there a doc on how this is made performant and set up? Is all we need to do is list out all the allowed queries and that's it? How does the list help on a serverless platform as the stored procedures need to be created accordingly?
GraphJin is used in severless environments like Google Cloud Run, Cloud Functions and App Engine all of which are scale to zero services. GraphJin should work fine in any functions as a service environment since database discovery is a single very fast query and GraphJin is designed to be up and running in under 20ms which ads almost zero overhead to any function call.
Is there anyway to avoid the initial single fast query
. This might be an issue for bigger schemas; and might avoid the one query to the db for making APIs respond faster.
GraphJin is used in severless environments like Google Cloud Run, Cloud Functions and App Engine all of which are scale to zero services. GraphJin should work fine in any functions as a service environment since database discovery is a single very fast query and GraphJin is designed to be up and running in under 20ms which ads almost zero overhead to any function call.
@dosco is there only docs on this setup ? I really want to try this out
I need to check the wiki but every new GraphJin project contains a cloudbuild.yml
to build and deploy GraphJin on CloudRun. Its just a matter or deploying the Docker image with the right environment vars set. GJ_DATABASE_HOST, GJ_DATABASE_USER and GJ_DATABASE_PASSWORD
@krish7919 Not at the moment all of GraphJin works off this database discovery. Unless your using cloud function that are rarely called (run from cold start every time) it won't be much of an issue.
thanks for the tips !! This is really cool
@dosco
I tried to write a small PoC using graphjin library, and pointed it to my currrent tables. It fails with the error panic: postgres: foreign key column not found: X.Y
when calling NewDBSchema
in the _initSchema()
method. So not quite a drop in for the the current postgraphile
solution we use.
I tried to write a small PoC using graphjin library, and pointed it to my currrent tables. It fails with the error
panic: postgres: foreign key column not found: X.Y
when callingNewDBSchema
in the_initSchema()
method. So not quite a drop in the the currentpostgraphile
solution we use.
Definitly want to fix this issue for you. Can you give me the following details.
Added a fix for this 01d54ca2edb1f0051fd7dd8e324239e773e75054 @krish7919 can you please test and confirm.
To answer your question, @dosco:
Are you using multiple schemas? Is this a foreign key relationship between tables in two different schemas?
No, just a single one. As a matter of fact, only the public
schema.
Is there anything special about this foreign key column (some special postgres feature)?
Nothing special here.
Actually, we can get rid of foreign key checks in Graphjin if there was an option to, and move ahead, but it doesn't seem to be the case today (or I couldn't find it).
@krish7919 can you confirm if the above fix worked for you?
@dosco I think you are confusing one issue with another. The error I have mentioned above still exists, its a Graphjin library logic error. The fix you have done performs an early return; I had already made that fix locally and the error message mentioned above is what happens with my database schema. And it works perfectly fine with postgraphile
.
The early return on error is fixed and it is fine to close that issue; this issue should not be as Graphjin fails in scenarios where postgraphile
doesn't.
The commit has other changes as well in addition to including that err check I had previously missed. The schema discovery now has ordering. The previous random processing order of columns might have also been a reason for this issue. I have also updated the test which helped me recreate your error and the fix makes the test pass. If you could try the latest code and see if the issue still exists for you that would be helpful. GraphJin has nothing to do with postgrphile these are entirely different codebases.
My go.mod
looks like so now:
go 1.19
require (
github.com/dosco/graphjin v0.20.33-0.20221007075649-01d54ca2edb1
github.com/jmoiron/sqlx v1.3.5
github.com/lib/pq v1.10.7
)
And the same error persists, although the ordering persists.
I do not know much about our db schemas at present. However, considering the schema is consistent and postgraphile
doesn't complain, I find it odd that Graphjin complains about the FK error mentioned above.
Just to be double sure you are using the right commit id I pushed a new version update could you retry but first go get the latest
go get github.com/dosco/graphjin@latest
$ head -n10 go.mod
module /path/...
go 1.19
require (
github.com/dosco/graphjin v0.20.33
github.com/jmoiron/sqlx v1.3.5
github.com/lib/pq v1.10.7
)
$ go run main.go
panic: postgres: foreign key column not found: X.Y
goroutine 1 [running]:
main.main()
/path/x/y/z/main.go:93 +0x235
exit status 2
If it matters, we are at Postgres v10
I dumped the columns in c := t.Columns[i]
and foreign table (?) in ft := s.tables[v.nodeID]
and I can confirm that the columns c
is present in ft
but the getColumn
does not return it.
EDIT 1:
This happens because DBTable.colMap
does not have the column that we are looking for.
EDIT 2:
And this is because the columns have different casing. For example, I have the column in DBTable.colMap
that is traitid
, but the actual column name is traitId
. Where does this case diff come from?
EDIT 3:
It seems to come from this line in code at https://github.com/dosco/graphjin/blob/master/core/internal/sdata/tables.go#L169:
ti.colMap[strings.ToLower(c.Name)] = i
EDIT 4:
I removed the ToLower
condition and it moves ahead and NewGraphJin()
completes fine. Next issue is when I call GraphQL()
, it is looking for a table for the query I run, but not at the Postgres function created for the corresponding Graphql query.
Thanks for this I need to add a test with mixed-case names. I'll have a fix out.
As a summary, I am trying to comb through the library now to resolve why my query does not find the correct DB function to call.. :)
I am trying to understand a bit, but it spews the error table not found:
from Find()
instead of trying to find the function name.
It doesn't seem that SQL Functions are handled in graphjin as is; I see some commented and an early exit with error when a table name corresponding to the query is not found; rather than continuing and checking for function names (or vice versa, check for functions first and then query).
but not at the Postgres function created for the corresponding Graphql query.
Can you explain this further. We don't currently support functions in GraphQL selectors only tables. Functions can only be used for aggregation. For example to count users by the id field. Internally this converts to count(id). I'm open to adding support for your usecase.
users {
count_id
}
Yes, this should have been documented, or if it has been, I missed it.
We use functions exclusively so that we can refactor them in case the table schema cannot be immediately updated/changed.
Thanks for considering adding support for this.
Can you share an example of a function that returns a table it's not a postgres feature that I have used personally. I'm guessing it wont be too hard to add.
Every function can return a table, ref https://www.postgresql.org/docs/current/sql-createfunction.html
EDIT:
Some of our functions also return a boolean, or a set or other return types, as mentioned in https://www.postgresql.org/docs/current/xfunc-sql.html
@krish7919 cool thanks i'll look into this today. I hope the latest update fixes the column not found issue for you. There were several places where things were lowercased remenants from when we were attempting to make things case insenitive. Its all case-sensitive now I guess your mixed case column names helped fix it.
The column issue is fixed. Thanks for looking into the implementing support for functions. 👍
I still think the naming convention is a tad off compared to postgraphile
library. For example, we have something like x_v_1_2_y_z
which gets converted to x_v_12_y_x
in grpahjin
(based on the EnableCamelCase
option), but is converted to something differnet in postgraphile
. This should be a minor though and I will list all the cases here once I can get something to run with the functions.
Can you share the exact camel case for x_v_1_2_y_z
in postgraphile. I'm happy to update our function to match the behaviour.
As 1 example, the graphql quey getVersionV100StatsForAllDatapoints
is mapped to the SQL function get_version_v1_0_0_stats_for_all_datapoints
in Postgres. However, there are many others and I could not enumerate all of them right now. However, most are of the form mentioned above.
I do know that there are some corner cases (for example something like iphone_6e
would change to iphone6E
, instead of iphone6e
), but that's a remnant of moving from postgraphql
to postgraphile
.
The only difference I'm seeing here is that numbers (digits) are being split into underscore seperated values while I think we kep numbers together. We could provider a config to enable splitting numbers. iphone6E
does not look like a corner case to me it follows the same pattern that V1
does in the previous getVersionV10..
example you provided.
Lots of major fixes and updates since this thread was last commented on most of these bugs have been fixed also we now generate a schema file schema.graphql
automatically from the db in dev mode and can use it in production mode to prevent any discovery queries in prod and make GraphJin a much nicer option for serverless environments like lambda functions
Summary:
I stumbled on this project via https://www.percona.com/resources/videos/automagical-graphql-sql-compiler and I have a use-case which would suit perfectly with this project. However, this might need the addition of a feature.
The current problem statement is to accept a graphql request via AWS Lambda (or any serverless framework), spin up the lambda, convert the graphql to plain sql, run the query, fetch/insert the result, return the reponse. I use Postgresql.
I am currently researching and have almost finalised doing this via the
Postgraphile
library in Javascript, making use of its--write-cache
CLI option to write the schema and relationship data discovered in a storage medium (or package it in the Lambda), and then use the--read-schema
CLI option to fetch this cached data, validate the graphql query against it and return the result to caller. (I am not using CLIs anymore, but the functions that are being called by the CLI, but that is just an implementation detail for now).What would you like to be added: Option/method to save the discovered data to a file or some cache.
Why is this needed:
I would like to use
graphjin
to do something as explained above, however, I am unable to find any documentation to save the discovered data. In serverless framework, one does not need to discover this data for every API call that is made, and hence, this will be a very useful feature, and will allow for efficiency gain for use-cases like mine, performance, and better adoption of the library in general.