aPureBase / KGraphQL

Pure Kotlin GraphQL implementation
https://kgraphql.io
MIT License
298 stars 58 forks source link

Lexer parses only alphabetical characters #72

Open noep opened 4 years ago

noep commented 4 years ago

Some specific cases. I use not only alphabetical characters but also Koreans. but lexers don't parse any letters without alphabetical characters.

let me show some codes, l make some schema with korean characters.

data class 한글(val 번호:String, val 내용:String)
query("한글") {
    resolver { 번호: String? ->
    한글("1234", "안녕하세요")
    }
  }

and I make some http request to use graqhQL request

POST /client/graphql? HTTP/1.1
Host: localhost:8080

{
  "query": "{ 한글 {번호, 내용}}"
  ,"operationName": null
  ,"variables": null
}

and Errors response

{
    "errors": [
        {
            "message":"Syntax Error: Cannot parse the unexpected character \"\한\".",
            "errorType": "GraphQLError"
        }
    ]
}

It seems like lexers don't parse any letters without alphabetical characters. com.apurebase.kgraphql.request.Lexer line 129 and line 130

private fun readToken(prev: Token): Token {
return when (val code = body[pos].toInt()) {
   ... codes
// A-Z _ a-z
            in (65..90), 95, in (97..122) -> readName(pos, col, prev)
   ... codes
}
}

And line 216 and line 217

private fun readName(start: Int, col: Int, prev: Token): Token {
... codes
while (code != null && (code == 95 || // -
                    (code in 48..57) || // 0-9
                    (code in 65..90) || // A-Z
                    (code in 97..122)   // a-z
                    )
... codes

it is easy to support just only korean characters with add some branch codes. but will occur 2 problems

  1. the Lexer seems porting of javascript code, is it right way to customize the Kotlin Lexer?
  2. that patch seems to temporary. more approach needs to support multilingual languages
jeggy commented 4 years ago

It could be that we should add support for this via a an option that you could set within your configuration of your schema.

KGraphql.schema {
  configure {
    // Enabling this would change to a more flexible parser.
    fullUnicodeNamesSupport = true
  }
}

The previous parser had some small issues here and there, so I decided to redo the whole parser implementation and choose to just port over the JavaScript one. As it's written by Facebook and should have been very well tested.

In the spec - 2.1.9 - Names it says it should only support ASCII. But I'm fully open for having support for a configuration to support more if you enable it.