quarkusio / quarkus

Quarkus: Supersonic Subatomic Java.
https://quarkus.io
Apache License 2.0
13.78k stars 2.68k forks source link

Unquoted string return : not a valid json format #39761

Open louislouislouislouis opened 7 months ago

louislouislouislouis commented 7 months ago

Describe the bug

This little snippet, using the extension quarkus-resteasy-reactive-jackson

@GET
@Path("/test")
@Produces({"application/json"})
public Uni<String> fetchAttribute2(){
  return Uni.createFrom().item("string");
} 

or this one simpler

@GET
@Path("/test")
@Produces({"application/json"})
public String test() {
  return "string";
}

does not have the good behavior. The returned string is unquoted, whereas the headers indicate Content-Type:application/json;charset=UTF-8 An unquoted string is not a valid json, the returned string should be surrounded with quot

Expected behavior

This http response when fetched should be

"string"

with headers content-type: application/json

Actual behavior

This http response when fetched is actually

string

with headers content-type: application/json, but unquoted string are not valid json

How to Reproduce?

This little snippet, using the extension quarkus-resteasy-reactive-jackson

@GET
@Path("/test")
@Produces({"application/json"})
public Uni<String> fetchAttribute2(){
  return Uni.createFrom().item("string");
} 

or this one simpler

@GET
@Path("/test")
@Produces({"application/json"})
public String test() {
  return "string";
}

Output of uname -a or ver

22.04.1-Ubuntu x86_64 GNU/Linux

Output of java -version

openjdk 20 2023-03-21

Quarkus version or git rev

3.9.0

Build tool (ie. output of mvnw --version or gradlew --version)

Gradle 8.6

Additional information

This seems to came from this line. With the comment // YUK: done in order to avoid adding extra quotes... But unquoted string are not valid json.

quarkus-bot[bot] commented 7 months ago

/cc @FroMage (resteasy-reactive), @stuartwdouglas (resteasy-reactive)

FroMage commented 7 months ago

This is a design choice we made long ago, to allow users to return already-formatted JSON as String values.

For example, you can write:

@GET
@Path("/test")
@Produces({"application/json"})
public String test() {
  return "{\"foo\": true}";
}

And we won't quote it. This is useful in many cases. The problem is that in Java there's no distinction between String the JSON type (which needs to be quoted) and String which is already-valid JSON.

We could, I suppose add a JSONString type that would be auto-quoted by our JSON marshallers, in line with the existing JSONArray and JSONObject, but it would be of very little value since most JavaScript endpoints return objects or arrays, almost never the other JSON types. Also, note that for the same reason, we don't marshall int and other numerical values as JSON from endpoints.

I'm not sure of the value of adding a JSONString type given that it's relatively easy to add quotes from the endpoint before returning the value.

jekkel commented 7 months ago

Although I can follow the reasoning, still for some use cases this design decision makes things a little bit ugly. Imagine an endpoint which accepts something like a JSON path as URL path suffix (e.g. /documents/{id}/path/to/value) which gets used to extract the corresponding value from a schemaless document. The code can't know in advance to which type the path resolves for a given document. Am I right that in such use cases it's expected to basically do one of 2 things:

  1. test wether the extraction results in a String value and quote it before returning
  2. or always marshall all values circumventing framework support for marshalling.

But how would one then realize content negotiation? A common use case for us is to support application/cbor beside application/json. If I am not mistaken, one would need to explicitly react on the chosen content type to drive marshaling decisions in the code. This seems like a framework's responsibility bleeding into application code.

Maybe I am missing something here?

FroMage commented 7 months ago

The code can't know in advance to which type the path resolves for a given document

We've never heard of such a use-case before, that I can recall. This requires that your return type is Object as that's the only common superclass of possible JSON values. I'm not sure how this would work wrt. marshalling. But it's an interesting use-case.

As for content negociation, how do you support application/cbor at the moment? We've been thinking about adding support for it, but I don't think we did.

rnza0u commented 7 months ago

This thread on SO is from me:

https://stackoverflow.com/questions/78238888/quarkus-is-unable-to-serialize-a-simple-string-as-json

It already has got 2 downvotes. Quite funny.

If this is a "design choice made a long time ago", then that means it should be documented. Especially since it is an implicit exception to the serialization mecanism of Quarkus.

jekkel commented 7 months ago

As for content negociation, how do you support application/cbor at the moment?

TBH we don't use it yet in the quarkus application. But it's in use in a Spring application and it's certainly a thing for us.

This requires that your return type is Object as that's the only common superclass of possible JSON values

That's correct.

But it's an interesting use-case.

Our real-world use case is actually even simpler, without any JSON path, but with an "any" schema for values of a map of attributes of a resource. These attributes are allowed to have any JSON compatible value, complex or primitive.

Regarding the proposal of adding JSONString: From my perspective the other way around would be clearer. Introduce something like JSONValue which can wrap a String and inhibits encoding. Whereas everything else gets passed to the marshaller.

Bottom line: now that we know about the details we can work our way around it (at least for now) but I'd strongly encourage to rethink the design decision for the sake of consistency and compatibility. Feel free to close the ticket if you plan to stick with it.

Thanks for the fast response by the way and kudos for quarkus in general! :rocket:

rnza0u commented 7 months ago

but it would be of very little value since most JavaScript endpoints return objects or arrays, almost never the other JSON types. Also, note that for the same reason, we don't marshall int and other numerical values as JSON from endpoints.

It is rare, but it makes sense in a lot of cases.

I'm not sure of the value of adding a JSONString type given that it's relatively easy to add quotes from the endpoint before returning the value.

JSON strings aren't just made by adding double quotes. Encoding must be performed, for inner double quotes, for line breaks and for unicode characters.

I don't think that it is a good idea to stop serializing strings. It is likely to be unexpected because it is an implicit behavior. All values should be marshalled by default unless we would add something like another annotation or return a special type. That would be more explicit and make things more clear.

FroMage commented 7 months ago

I was looking for issues or commits which explained why we did this, because this is 5 years ago, my memory is not reliable, and I could not find the issues where we discussed this. I do remember a discussion, just, not where.

The commit which introduced the "treat json-producing strings as pre-encoded" is a move commit, so that's rather unfortunate, because the behaviour is not explained: 971ee4648488de01925a74258cf057b4d865e434 (and below)

@stuartwdouglas or @geoand do you remember why we went that way? We could have made any JSON marshaller quote String and other JSON value types. There must have been a reason. Perhaps the TCK?

Before we make any changes, we should figure out why it is the way it is.

geoand commented 7 months ago

Perhaps the TCK?

I think this was the reason

jdussouillez commented 3 months ago

I'm facing the same problem today.

The problem is that in Java there's no distinction between String the JSON type (which needs to be quoted) and String which is already-valid JSON.

I agree with @Hakhenaton:

All values should be marshalled by default unless we would add something like another annotation or return a special type.

It would be nice to serialize the value by default except if there is something like a @JsonRaw annotation on the endpoint.

Inspired by Jackson's JsonRawValue annotation

This would be nicer than the current workaround:

@Inject
ObjectMapper jsonMapper;

@GET
@Path("/{id}")
@Produces(MediaType.APPLICATION_JSON)
public Uni<String> get(final String id) {
    return service.get(id)
        .map(url -> {
            try {
                return jsonMapper.writeValueAsString(url)
            } catch (JsonProcessingException ex) {
                // ...
            }
        });
}
geoand commented 3 months ago

It would be nice to serialize the value by default except if there is something like a @JsonRaw annotation on the endpoint.

That sounds pretty reasonable

FroMage commented 3 months ago

Note that the same applies to numbers (which may have a different representation in JSON than how we format them) and boolean and null (though in those cases I suspect the value/format is compatible.

One way to try this out is to make the switch and run CI and see how many things break.

If anyone wants to make a PR, we can help :)

helpmessage commented 4 weeks ago

I found out a neat trick for those who want the quoted String in return. Server:

@GET @Path("/test") @Produces({"application/json"}) public Optional<String> test() { return Optional.of("string"); }

and client:

@GET @Path("/test") @Produces({"application/json"}) String test();

On the client, the String received is quoted.

Not sure if this is the excepted behavior or if someone could use this...

FroMage commented 4 weeks ago

Ouch, no that is not expected 😅