ngs-doo / dsl-json

High performance JVM JSON library
https://dsl-platform.com
BSD 3-Clause "New" or "Revised" License

JsonValue syntax tree #265

Open andyglow opened 8 months ago

andyglow commented 8 months ago

Hi! I'm looking for a way to get JSON (or part of it) parsed into a valid JSON value, whether it is a JSON object, a JSON array, or any other valid type.

I know that today I can refer to the Object type and the parser will do the job, handing me a combination of lists, maps, and primitives of all kinds. But what I am looking for is something similar to circe Json, jsoniter Any, or jackson JsonNode: something that represents the JSON language itself.

The reason I need it is that part of the request I parse has a fuzzy structure, and I don't know in advance what type of value is there. I can use Object, as I noted above, but I don't want to keep it as abstract as Object is.

Can you suggest something?

PS: I also understand that I could create my own hierarchy of classes and use @JsonConverter to write my own parser, but I was wondering if something similar already exists.

Thank you

zapov commented 8 months ago

I'm not aware of any such thing. I don't think there is any significant difference between testing for specific types and checking the JsonNode type.

andyglow commented 8 months ago

Well, sometimes your data is fuzzy. Think about big data processing, for instance an ingestion pattern: your code should be able to take any JSON, but sometimes you know that you don't care about the rest of the payload, only that there is a blah-blah field you use for a check. Or you know there is a field called events carrying an array of JSON values whose structure you don't care about, but you need to somehow pass them downstream after some checks.

andyglow commented 8 months ago

just added a little PR demonstrating the feature: https://github.com/ngs-doo/dsl-json/pull/267

zapov commented 8 months ago

I'm sure it might be useful to some people. What I meant is that I don't think there is a significant difference between code doing

if (nodeType == JsonNode.Object) {
 // then this
} else if (...

And

if (value instanceof Map) {
  // then this
} else if (...

Or even your PR where you have

if (value instanceof JsonObject) {
  // then this
} else if ( ....

Anyway... I don't personally need that, but if many people find it useful I don't mind merging PRs that set up some utilities to deal with JSON in such a way.

Also, keep in mind that "sparse" processing of JSON is already supported. If you are interested only in a subset of fields (from a known structure), you can just deserialize the JSON into this smaller structure. Deserialization will ignore all the "unknown" fields by default. This covers many similar use cases where people resort to such "low-level" inspection of the incoming structure.

andyglow commented 8 months ago

Yes, I know, an ADT is not supposed to be the general approach in this case, but it might be helpful in some scenarios.

By the way, separate from this discussion: is there functionality that would help me capture a slice of JSON without really parsing it? Say we have some JSON with rich nesting:

{
  "userId": "string",
  "userEmail": "string",
  "userAge": 32,
  "metadata": {
    .. some multi-level struct ..
  }
}

and I need to capture "userId" along with "metadata", but I don't really care about the type of "metadata", and I don't want to really parse it at this phase. I would be happy to be able to extract it as a ByteBuffer or String. The only requirement is that I need to check that it's well-formed JSON.

So eventually I need something like this:

public record UserExtraction(String userId, String metadata /* or ByteBuffer or CharBuffer or byte[] or char[] */) {}

What would you recommend here? Thank you
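For illustration, the kind of extraction described above can be sketched in plain Java with no library at all. The class `RawJsonSlice` and its method `topLevelFields` are hypothetical names, not part of DSL-JSON. The scanner walks the input once, checks structural well-formedness, and keeps each top-level value as raw text instead of building objects. It is deliberately lenient in places: escape sequences inside strings are not validated, and numbers are only checked with `Double.parseDouble`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of DSL-JSON.
final class RawJsonSlice {
    private final String src;
    private int pos;

    private RawJsonSlice(String src) {
        this.src = src;
    }

    /** Maps each top-level key of a JSON object to the raw text of its value. */
    static Map<String, String> topLevelFields(String json) {
        RawJsonSlice scanner = new RawJsonSlice(json);
        scanner.skipWhitespace();
        if (scanner.peek() != '{') throw scanner.error("expected '{'");
        Map<String, String> fields = new LinkedHashMap<>();
        scanner.skipObject(fields);
        scanner.skipWhitespace();
        if (scanner.pos != json.length()) throw scanner.error("trailing characters");
        return fields;
    }

    private void skipValue() {
        char c = peek();
        if (c == '{') skipObject(null);
        else if (c == '[') skipArray();
        else if (c == '"') skipString();
        else if (c == 't') expect("true");
        else if (c == 'f') expect("false");
        else if (c == 'n') expect("null");
        else skipNumber();
    }

    // When sink is non-null, records key -> raw value text for this object.
    private void skipObject(Map<String, String> sink) {
        pos++; // consume '{'
        skipWhitespace();
        if (peek() == '}') { pos++; return; }
        while (true) {
            skipWhitespace();
            int keyStart = pos;
            skipString();
            String key = src.substring(keyStart + 1, pos - 1); // escapes left undecoded
            skipWhitespace();
            if (peek() != ':') throw error("expected ':'");
            pos++;
            skipWhitespace();
            int valueStart = pos;
            skipValue();
            if (sink != null) sink.put(key, src.substring(valueStart, pos));
            skipWhitespace();
            if (endOfContainer('}')) return;
        }
    }

    private void skipArray() {
        pos++; // consume '['
        skipWhitespace();
        if (peek() == ']') { pos++; return; }
        while (true) {
            skipWhitespace();
            skipValue();
            skipWhitespace();
            if (endOfContainer(']')) return;
        }
    }

    // Consumes ',' (returns false) or the closing bracket (returns true).
    private boolean endOfContainer(char close) {
        char c = peek();
        if (c != close && c != ',') throw error("expected ',' or '" + close + "'");
        pos++;
        return c == close;
    }

    private void skipString() {
        if (peek() != '"') throw error("expected string");
        pos++;
        while (true) {
            char c = peek();
            pos++;
            if (c == '"') return;
            if (c == '\\') { peek(); pos++; } // skip the escaped character unchecked
        }
    }

    private void skipNumber() {
        int begin = pos;
        while (pos < src.length() && "+-0123456789.eE".indexOf(src.charAt(pos)) >= 0) pos++;
        if (pos == begin) throw error("unexpected character");
        // Lenient check; NumberFormatException is an IllegalArgumentException.
        Double.parseDouble(src.substring(begin, pos));
    }

    private void expect(String literal) {
        if (!src.startsWith(literal, pos)) throw error("expected " + literal);
        pos += literal.length();
    }

    private void skipWhitespace() {
        while (pos < src.length() && " \t\r\n".indexOf(src.charAt(pos)) >= 0) pos++;
    }

    private char peek() {
        if (pos >= src.length()) throw error("unexpected end of input");
        return src.charAt(pos);
    }

    private IllegalArgumentException error(String message) {
        return new IllegalArgumentException(message + " at offset " + pos);
    }
}
```

With this sketch, `RawJsonSlice.topLevelFields(json).get("metadata")` yields the untouched text of the nested object, and malformed input is rejected with an `IllegalArgumentException`. String values such as "userId" come back raw as well, quotes included, so they would still need decoding.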

andyglow commented 8 months ago

Replying to your comment, I would say both yes and no. The concern is the type-unsafety of the API that we expose here. I can ask to parse into Object and I will receive a Map<String, Object> or similar, right? Then somebody would ask himself: is that map mutable? Is it hashed or linked? And the nested stuff, that list, is it linked or array-backed or something else? So we set a precedent that leads users into nasty code.

And don't take it as criticism. I know a bunch of people who would accept this style of code, whereas others would prefer a more precise and safe style where the compiler is able to prevent accidental mistakes.
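To make the contrast concrete, here is a minimal sketch of the kind of ADT being discussed, assuming Java 17 or later. The `JsonValue` hierarchy and the `fromUntyped` helper are hypothetical names for illustration only; they are not DSL-JSON types and not necessarily what the PR contains. The helper wraps an untyped tree of maps, lists, and primitives into immutable, explicitly named cases, which answers the "is it mutable? is it ordered?" questions once, at the boundary.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical ADT for illustration; DSL-JSON does not ship these types.
sealed interface JsonValue {
    record JsonNull() implements JsonValue {}
    record JsonBool(boolean value) implements JsonValue {}
    record JsonNumber(Number value) implements JsonValue {}
    record JsonString(String value) implements JsonValue {}
    record JsonArray(List<JsonValue> items) implements JsonValue {}
    record JsonObject(Map<String, JsonValue> fields) implements JsonValue {}

    /** Converts an untyped tree (Map / List / String / Number / Boolean / null). */
    static JsonValue fromUntyped(Object raw) {
        if (raw == null) return new JsonNull();
        if (raw instanceof Boolean b) return new JsonBool(b);
        if (raw instanceof Number n) return new JsonNumber(n);
        if (raw instanceof String s) return new JsonString(s);
        if (raw instanceof List<?> list) {
            List<JsonValue> items = new ArrayList<>();
            for (Object item : list) items.add(fromUntyped(item));
            return new JsonArray(List.copyOf(items)); // immutable copy
        }
        if (raw instanceof Map<?, ?> map) {
            // LinkedHashMap keeps the iteration order of the source map.
            Map<String, JsonValue> fields = new LinkedHashMap<>();
            for (Map.Entry<?, ?> e : map.entrySet()) {
                fields.put(String.valueOf(e.getKey()), fromUntyped(e.getValue()));
            }
            return new JsonObject(Collections.unmodifiableMap(fields));
        }
        throw new IllegalArgumentException("not a JSON-like value: " + raw.getClass());
    }
}
```

Because the interface is sealed, the set of cases is closed, so code that inspects a `JsonValue` cannot meet an unexpected runtime type. The cost is one extra pass over the tree and one more set of objects, which is the trade-off against speed raised in this thread.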

zapov commented 7 months ago

Something like this should work

public class Metadata {}
public record UserExtraction(String userId, Metadata metadata) {}

You must respect the JSON structure this way. Otherwise you would need a custom converter to process metadata in a special way (e.g., skip over the object/array/string).

Of course it's always better to make an API feel safe and expected, as long as you are not compromising other goals. Anyway, DSL-JSON does not have these internal structures, in order to speed things up.

If someone wants to introduce something along those lines during or after the parsing, I have no problem with that. I have no need for such an API. I always try to deal with classes and known data structures, and if some specific object is strange, I use a converter for that specific part of the JSON. To me that is a superior approach to trying to get "raw json objects" and dealing with them via an ADT or type matching or the like.