cbeust / klaxon

A JSON parser for Kotlin
Apache License 2.0
1.86k stars 121 forks source link

Expose intermediate JSON AST #280

Open sghpjuikit opened 5 years ago

sghpjuikit commented 5 years ago

Problem: Currently JSON AST (abstract syntaxt tree, i.e. abstract json representation) is not exposed when converting objects to json. This is a design issue inherently leading to lack of flexibility and capability. An example: no support for pretty printing json of an object.

Related issues/concepts, which would benefit from this:

sghpjuikit commented 5 years ago

This is a breaking change. Considering the 1200 stars accompanying this project, it is something to consider. On the other hand AST is (in my opinion) quite integral here so it may be absolutely worth it, especially in the long run.

Backward compatibility This involves changing the Converter interface, which would completely break all Converter implementations out there. Thus, we would have to provide a convenience Converter that users would switch to and which would work as Converters do right now and which would basically reread its output back into AST and then return it to conform to the new API. Not exactly ideal, but better than breaking code and a big red warning telling users to fix their code

Performance This may also change the performance dramatically, which is something which should be evaluated (i.e. performance test is more than a good idea).

Reference https://github.com/spray/spray-json

PR As for PR, give me few days to check the scope of the issue and to decide whether I could work on this.

sghpjuikit commented 5 years ago

I cloned the repo and had a look. Unfortunately, this would be a lot of work, I'm not sure about if I can make full PR.

Backward compatibility could be made a non-issue if we actually let the converters as are and merely make them extend a new, more generic kind that works with the AST.

In order to understand the json AST, I made my own working AST prototype and integrated it with Klaxon in order to leverage its parsing state machine. Has exactly 250 lines of code. Including basic error handling, printing and all conversions. I will finish the prototype and provide full code, then look into whether it could be joined somehow into the library. Prototype details below:

Supports

This also

There are couple possible improvements:

AST uses sealed classes and looks like:

sealed class JsValue
object JsNull: JsValue()
object JsTrue: JsValue()
object JsFalse: JsValue()
class JsString(val value: String): JsValue()
class JsNumber(val value: Number): JsValue()
class JsArray(val value: List<JsValue>): JsValue(), JsRoot
class JsObject(val value: Map<String, JsValue>): JsValue(), JsRoot

The converter would look +- like this:

interface ToJsonConverter<T> {
   fun canConvert(value: T): Boolean = true
   fun toJson(value: T): JsValue
   fun fromJson(value: JsValue): T?
}

An implementation of (performance unoptimized) pretty print using this AST is:

fun JsValue.prettyPrint(indent: String = "  ", newline: String = "\n"): String {
   fun String.jsonEscape() = replace("\\", "\\\\").replace("\"", "\\\"")
   fun String.toJsonString() = "\"${this.jsonEscape()}\""
   fun String.reIndent() = replace(newline, newline + indent)
   return when (this) {
      is JsNull -> "null"
      is JsTrue -> "true"
      is JsFalse -> "false"
      is JsString -> value.toJsonString()
      is JsNumber -> value.toString()
      is JsArray ->
         if (value.isEmpty()) "[]"
         else "[" + newline + indent + value.joinToString(",$newline") { it.prettyPrint() }.reIndent() +  newline + "]"
      is JsObject ->
         if (value.isEmpty()) "{}"
         else "{" + newline + indent + value.entries.joinToString(",$newline") { it.key.toJsonString() + ": " + it.value.prettyPrint() }.reIndent() + newline + "}"
   }
}

Creating custom converters is now absolute piece of cake, I'm not even using the DSL yet:

// uuid
object: ToJsonConverter<UUID> {
   override fun toJson(value: UUID) = JsString(value.toString())
   override fun fromJson(value: JsValue) = value.asJsString()?.let { UUID.fromString(it) }
}
// polymorphic hierarchy +  heterogeneous json structures based on type
         object: ToJsonConverter<PropVal> {
            override fun toJson(value: PropVal): JsValue = when (value) {
               is PropVal.PropVal1 -> JsString(value.value)
               is PropVal.PropValN -> JsArray(value.value.map { JsString(it) })
               else -> fail { "Unexpected value=$value, which is not ${PropVal::class}" }
            }
            override fun fromJson(value: JsValue) = when (value) {
               is JsString -> value.value.let { PropVal.PropVal1(it) }
               is JsArray -> value.value.mapNotNull { it.asJsString() }.let { PropVal.PropValN(it) }
               else -> fail { "Unexpected value=$value, which is not ${JsString::class} or ${JsArray::class}" }
            }
         }

Understand Klaxon's 'AST' using:

fun fromKlaxonAST(ast: Any?): JsValue {
   return when (ast) {
      null -> JsNull
      true -> JsTrue
      false -> JsFalse
      is String -> JsString(ast)
      is Number -> JsNumber(ast)
      is JsonObject -> JsObject(ast.map.mapValues { fromKlaxonAST(it.value) })
      is JsonArray<*> -> JsArray(ast.map { fromKlaxonAST(it) })
      is JsonValue -> null
         ?: ast.array?.let { fromKlaxonAST(it) }
         ?: ast.obj?.let { fromKlaxonAST(it) }
         ?: ast.boolean?.let { fromKlaxonAST(it) }
         ?: ast.boolean?.let { fromKlaxonAST(it) }
         ?: ast.char?.let { JsString(it.toString()) }
         ?: ast.double?.let { JsNumber(it) }
         ?: ast.float?.let { JsNumber(it) }
         ?: ast.int?.let { JsNumber(it) }
         ?: ast.longValue?.let { JsNumber(it) }
         ?: ast.string?.let { JsString(it) }
         ?: JsNull
      else -> fail { "Unrecognized klaxon AST representation=$ast" }
   }
}