dosco / graphjin

GraphJin - Build NodeJS / GO APIs in 5 minutes not weeks
https://graphjin.com
Apache License 2.0
2.92k stars 179 forks source link

Full business logic in JavaScript #279

Closed slava-vishnyakov closed 1 year ago

slava-vishnyakov commented 3 years ago

Awesome project, thank you Vikram!

I'm thinking about something like a "forgot password" functionality. I don't think it's quite possible yet to implement it fully in GraphJin yet, but maybe with a few small changes it could be done (and it opens a lot of other possibilities too). (Please correct me if this is all already possible).

Basically, a user requests something like this:

mutation forgotPassword @script(name: "forgotPassword.js") (email: "pied@pip.er") {  ​
​  ok
  ​message
  requestId
}

where forgotPassword.js could be something like this:

function request(vars) {
  ​// validate the request
  ​if(!vars.email.includes(/@/)) { 
    // return an error instead of processing
    return {ok: false, message: 'email does not seem correct');
    // or throw '...'?
  }

  // check if user exists
  ​const user = graphql('query { user(email: $email) { id } }', {email: vars.email});
  if(user.id === null) { // or user === null?
    return ....;
  }

 ​ // save the request
  ​const data = { email: vars.email, code: generateRandomString(), time: new Date(), attempts: 0 };

  // graphql() function runs a GraphQL request on our database
  ​const graphqlResult = graphql('mutation { forgot_password_requests (insert: $data) { id } }', {data});

  ​// send an email
  const html = `You have requested...`
  ​http.post('mailgun.com', {html, to}); // send the email to the user

 ​ // return a custom object
 ​ return { ok: true, message: "Sent! Check your email", requestId: graphqlResult.id }; // don't do any postgres queries, just return this immediately
}

There are three things that I think are missing now in the GraphJin, that could make it hugely more useful: 1) ability to run GraphQL queries right from the request(...) 2) ability to immediately return results instead of GraphJin doing anything 3) ability to throw a custom error message, creating a more detailed validation

I think this opens huge possibilities - you could have a whole additional API without running node.js or something. Full custom business logic right in GraphJin

Then of course you'd need something that validates the code after it was sent, so it could be something like

mutation resetPassword @script(name: 'resetPassword.js') (requestId: "....", codeFromEmail: "123456abcdef", newPassword: "s3cr3t") {
  ok
}

and resetPassword.js:

function request(vars) {
  const data = graphql('query forgot_password_requests(id: $id) { code, attempts }', {id: vars.requestId};

  // TODO: check if not expired

  if(data.attempts > 3) {
    // some more logic to rate limit requests.. 
    return {ok: false, message: 'Too many attempts'};
  }

  if(data.code != vars.code) {
    // mutation to increment attempts
    return {ok: false, message: 'The code was not correct'};
  }

  return {ok: true}
}

I don't know - maybe instead of return {ok: false, message: "xyz error"} it would be better to throw an exception, but I think you get the idea.

This might actually also help with the problem that YugaByteDB guys told about - where a complex query like query getHome from older mobile clients could be intercepted by the backend via a custom request function, be processed in a more efficient manner (by running graphql() calls inside) and return the data with the same schema, but using a different way to run it.

Thank you!

slava-vishnyakov commented 3 years ago

To expand the idea a little bit, if we could implement this, something like this could be possible too:

// mask the email in results if authenticated user id is not 1 (admin)
function request(vars, request) {

  let result = graphql(request.query, request.bindings);
  // `request.query` is the incoming GraphQL query with $foo params intact
  // `request.bindings` is any of $foo substitutions, so we can send the request to graphql ourselves

  // request.userId contains the id from auth module
  if(request.userId !== 1) { 
    // ok, maybe a better way would have been to graphql() the user 
    // and see if user is is_admin or something, but just for brevity
    result.email = 'sorry, you cannot see this, only admins can';
  }

  return result;
}

which would be the equivalent of

function request(vars) {
  ...
}

function response(json) {
  // if userid != 1 -- not implemented and cannot check the databsae
  json.email = 'sorry, you cannot see this, only admins can';
}

but we don't really need response(...) functions anymore, because we could do everything in request(...).

slava-vishnyakov commented 3 years ago

Backwards compatibility issue: if this is implemented, it changes request from 1 parameter to 2, so older functions will break, so maybe it's better to use some new name and maybe an request object, like this:

Sample query:

query GetUsers(user_name: $name, first: 30) {
  user {
    id
  }
}

{name: "Jim"}
function process(req) {
  // req.type 
  // 'query' or 'mutation'

  // req.vars(?) has query/mutation params
  // {user_name: "Jim", first: 30}

  // req.query has the text of query with $xyz intact
  // `query GetUsers(name: $name) {\n  user....'

  // req.bindings has {xyz: 123} data
  // {name: "Jim"}

  // then we could easily do:
  let result = graphql(req.query, req.bindings);

  // req.schema has something like an object that has incoming graphql query parsed 
  // so that we can figure out which fields the user expects without manually parsing the req.query string
  // ? maybe {"user": ["id"]}, not really sure

  // req.userId = user ID from the auth module
  // 2 or null

  // req.headers - http headers of the request
  // {'authorization': 'Bearer 12345', 'user-agent': 'Netscape Navigator 5.0'}

  // req.remoteIp - IP of the remote server, this would allow some custom IP-based rate limits, like per hour or so
  // '8.8.8.8'

  return result
}

But in this case we definitely need to change the function name from request(), since the only argument totally changes

slava-vishnyakov commented 3 years ago

A few more advanced thoughts:

1) Being able to do async function process() with a possibility to await http.get could be pretty powerful, since right now I assume http.get is only synchronous and blocks the process, so you can't for example call multiple HTTP services concurrently and join their results.

2) Another thing that's missing here is the ability to subscribe/@stream? the results, which could have been really cool, so something like

async function process(req, emit) {
  for await (result of graphqlSubscribe('subscribe/@stream? query here')) {
    result.foo = 'bar';
    emit(result);
  }
}

Now that would have been really powerful, since we could call some other services too (via await http.get) and stream their results.

But that's something more advanced, though generally modern JS devs would assume you could both do a synchronous function (just like now) and async functions for more advanced stuff.

slava-vishnyakov commented 3 years ago

This also opens up possibilities to do some advanced validations - like whether this user could really "connect" that thing or not (I'm not sure if that's possible now with GraphJin, but with request.schema we could graphql(...) the data and verify that indeed the user is allowed to do this or throw 'not allowed' instead.

slava-vishnyakov commented 3 years ago

Caveat: during the development GraphJin should not add queries from graphql(...) function calls inside function process(...) to the white-list, because these are internal calls and in production mode GraphJin should not validate calls from graphql(...) function against the whitelist (since, again, these are internal, trusted calls).

Also, possibility to do transactions from JS could be really cool (so that graphql calls are in a transaction).

dosco commented 3 years ago

I really like this and would be open to exploring this further. When I added the @script tag I did think of taking it further with custom resolvers. I added some more relevant details to your other issue on the topic https://github.com/dosco/graphjin/issues/280

As for the Caveat you can disable the allow list when using GraphJin core as a library DisableAllowList: true.

slava-vishnyakov commented 3 years ago

As for the Caveat you can disable the allow list when using GraphJin core as a library DisableAllowList: true.

I don't think it's possible, since you need to set it to both true and false at the same time :)

Here's what I mean. Let's say, that I have some regular getPosts GraphQL query in the allow list, then also I have a custom JS getCustomUsers and in order to make that query I need to do two GraphQL queries inside the JS, like maybe:

query getCustomUsers {
  email
}

getCustomUsers.js:

function process(req) {
  const oldUsers = graphql(`query getOldUsers oldUsers(limit: 5) { email }`);
  const newUsers = graphql(`query getNewUsers newUsers(limit: 5) { email }`);
  return oldUsers + newUsers;
}

1) I want my customers to be able to run getPosts query and getCustomUsers query 2) I don't want the dev mode to accidentally add getOldUsers and getNewUsers to the allow list, since these are internal queries and should only be run by getCustomUsers - that's what the caveat was about.

So, I want allow list to be enabled, but getOldUsers and getNewUsers will not be in the allow list, since I don't want users to run them. But unless we ignore the allow list for internal JS calls, these graphql calls will be blocked on production and getCustomUsers will fail.

dosco commented 3 years ago

Ok I kinda understand what you're getting at. Can you try the same queries without the name (getOldUsers, getNewUsers) queries without a name are not saved to the allowlist (Not sure if they will work as you want but I'd like to find out). I do see why queries from the JS script should not be accessible publicly outside of the JS script. I'll look into this and see if I can have a fix.

Also unrelated but you can combine the above queries.

function process(req) { const data = graphql(query getUsers { oldUsers(limit: 5) { email } newUsers(limit: 5) { email } }); return data.oldUsers + data.newUsers; }

slava-vishnyakov commented 3 years ago

The example is superfluous, of course. Imagine there's some "if" after the first query that checks some condition, then it makes a second query.

Can you try the same queries without the name

The problem as I see it, is that this approach requires huge discipline. If anyone accidentally copies a named query from GraphQL explorer into an internal call - that instantly becomes whitelisted. Imagine that call is something like "increase the balance" (after checking that the user is entitled to that) or "unban a bad user" (after some ip-based rate limits checks in Redis).

EDIT: Also, I just realized that even if we used non-named queries - that means that as soon as we go to production, they will not work, so all internal functions will go down.

slava-vishnyakov commented 3 years ago

It seems that this isn't even that hard to do:

1) seed.js already has graphql(..., ....) function that exactly matches the function needed for this issue 2) seed.js already allows ANY query without going through the whitelist

So, basically, seed.js already implements at least one of the requirements perfectly!

dosco commented 2 years ago

FYI This has been implemented. https://www.reddit.com/r/golang/comments/nekgou/graphjin_graphql_to_sql_compiler_with_javascript/

slava-vishnyakov commented 2 years ago

@dosco I'm aware of @script, in the original post, I'm using it to show why this is not enough, skip to the paragraph that says "There are three things that I think are missing now in the GraphJin, that could make it hugely more useful".

slava-vishnyakov commented 2 years ago

@script in its current form is not enough for us to start using GraphJin, as explained in the original and following posts.

dosco commented 2 years ago

Sorry it's a long post I think I missed the point let me re-read.

dosco commented 2 years ago

I think the 3 points you made are valid and probably not very hard to implement. Let me look into them. Yes, I agree it would do away with the need for response but probably doesn't hurt to just leave it in.

  1. Ability to run GraphQL queries right from the request(...)
  2. Ability to immediately return results instead of GraphJin doing anything
  3. Ability to throw a custom error message, creating a more detailed validation
  4. Support async/await.

I need to think through 3 a bit more what would be the point of the selector if you are only using the graphql query to execute a script not sure if this is even valid graphql?

mutation resetPassword @script(name: 'resetPassword.js') (requestId: "....", codeFromEmail: "123456abcdef", newPassword: "s3cr3t") {
  ok
}

Also I did want to implement @stream and @defer but I don't think any client supports it at this time. It would also not be too hard to implement we have a lot of the infra. pieces in place.

slava-vishnyakov commented 2 years ago

Ability to throw a custom error message, creating a more detailed validation what would be the point of the selector if you are only using the graphql query

Well, the point would be that in absense of a way to do it in GraphJin, we need to do this in some other backend thing (node.js/Python/Ruby/PHP, etc...) and if we have that backend, then we have no need in GraphJin, because otherwise we'll have a wild mix of node/graphjin that would be very hard to test.

That's why we don't use GraphJin now - it doesn't allow us to move all the logic to scripts and use it as a backend and since we can't do it, then we need another backend and if we have another backend - it doesn't makes sense to have 2 backends and therefore GraphJin is out.

dosco commented 2 years ago

Fair enough we could probably add the ability to run scripts from our actions (we allow you to create http endpoints called actions in GraphJin) right now they can only run SQL queries but we can extend those to run scripts.

As for running two backends I know several users who have two backends once for custom scripts others for crud queries via graphjin it still saves them months of time that they would spend writing all those apis instead of just the special ones.

slava-vishnyakov commented 2 years ago

I'm not saying that GraphJin isn't useful, it is. But for the sites that we manage, we need to do a lot more than CRUD (like autorization - checking if user is banned, has permissions, etc... checking rate-limits, logging). If we do it in node.js or smth like that, it means that we only replace SQL calls with GraphJin calls in our node.js code - that's not much of time saving here, since both SQL and GraphQL are about the same size (unless you use deep joins, which your shouldn't in big databases), but at the same time we have to deploy and monitor one more thing (GraphJin).

dosco commented 1 year ago

graphjin now works as a nodejs app so this usecase can be implemented in javascript npm i graphjin

slava-vishnyakov commented 1 year ago

@dosco Now, that is awesome! Kudos!

dosco commented 1 year ago

that's not much of time saving here, since both SQL and GraphQL are about the same size (unless you use deep joins, which your shouldn't in big databases), but at the same time we have to deploy and monitor one more thing (GraphJin).

Using joins for even large databases is fine most of these are web queries so already limited to 5 or 10 results and then we use cross lateral join to pull in other related data (eg. blog post + author + like authors etc) else you'd have to munge this data together in the app code. additionally you also get the ability to do recursive queries (eg. comment + reply + reply_to_reply threads), built in efficient pagination using a cursor, remote joins with (eg. pull in stripe payment info for a user)

At scale as your apis grow in number it really saves a ton of time and you always get the most efficient query no sudden n+1 somewhere cause that one dev was not familiar with joins. Also most the words databases are internal company / org ones at non-tech companies where talent is a bigger challenge.

A centralized config file allows you to set rules in one place around blocking certains columns for certain roles or entirely, adding certain filters to certain tables and these rules will apply across all your apis. Addiitonally since .gql files are portable even across languages.

pedrosimao commented 1 year ago

@dosco Is it possible to add custom business logic directly in the GraphJin Standalone Service? The idea is to have all the mutations and queries available, but be able to intercept them and add logic before or after completion.

dosco commented 1 year ago

You can use OptionSetHookFunc in serv to set a hook function thats called on every request. You could extend that or add another similar hook if you want to do more.