argonaut-io / argonaut

Purely functional JSON parser and library in scala.
Other
547 stars 112 forks source link

Strictly decode to case class disallowing extraneous fields in JSON #104

Open LeifW opened 10 years ago

LeifW commented 10 years ago

So that

case class Person(name: String)
Parse.decode[Person]("""{"name": "Bob", "foo": "dunno"}""")

will yield something about "Error: unexpected field "foo"".

Is this feasible?

Example use case: When people call our API, it would be helpful to tell them if the misspelled a key or added junk that doesn't do anything - tell them that's not key that means anything to us and it'll be ignored / rejected as invalid.

markhibberd commented 10 years ago

@LeifW You can use validate to do that: https://github.com/argonaut-io/argonaut/blob/master/src/main/scala/argonaut/DecodeJson.scala#L55-L65

markhibberd commented 10 years ago

I will leave this open, as a reminder to add an example in the docs. Let me know if this doesn't address what you need.

From your example:

  implicit def PersonDecodeJson: DecodeJson[Person] = 
   casecodec1(Person.apply, Person.unapply).validate(....)

Also let me know if there is a general formulation that would let you not specify the validation function explicitly. It is worth noting that it used to be the default, but it was changed because some found it not desirable - but also because it generated fairly poor error messages - if we could address the poor errors part, I would be happy to add some more combinators to make it easy to validate without specifying a custom error message.

lukasz-golebiewski commented 8 years ago

@markhibberd The URL you pasted is dead. Do you have any examples on how to use validate?

kevinmeredith commented 7 years ago

To make sure that I understand this problem, @LeifW, let's please consider an example.

Given:

import argonaut._, Argonaut._

case class Person(name: String)

implicit def decode: DecodeJson[Person] =
  DecodeJson ( c => 
    for {
      name <- (c --\ "name").as[String]
    } yield Person(name)
  )

scala> Parse.decode[Person]("""{"name": "Bob", "foo": "dunno"}""")
res5: Either[Either[String,(String, argonaut.CursorHistory)],Person] = Right(Person(Bob))

you would expect, either by default in Parse#decode or an optional Parse#decodeExactly, that the extra foo field should result in a Left rather than Right?

LeifW commented 7 years ago

Exactly, yes, thanks for the example. I guess I'd lean towards the optional decodeExactly; I think people are used to the present behavior. A use case I was thinking of was better error messages. For large complex structures it can be easy to misspell a field, and the "missing field foo" would be easier to figure out when accompanied by a "unexpected field foob". Also sometimes people pass extra fields in their JSON API calls, and then they just get cargo culted in successive generations, even though they never did anything - people are afraid to remove them for fear of breaking something. I'd like to head that kind of thing off explicitly at the beginning.

On Apr 2, 2017 10:35 AM, "Kevin Meredith" notifications@github.com wrote:

To make sure that I understand this problem, @LeifW https://github.com/LeifW, let's consider an example.

Given:

import argonaut., Argonaut.

case class Person(name: String)

implicit def decode: DecodeJson[Person] = DecodeJson ( c => for { name <- (c --\ "name").as[String] } yield Person(name) )

scala> Parse.decode[Person]("""{"name": "Bob", "foo": "dunno"}""") res5: Either[Either[String,(String, argonaut.CursorHistory)],Person] = Right(Person(Bob))

you would expect, either by default in Parse#decode or an optional Parse#decodeExactly, that the extra foo field should result in a Left rather than Right?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/argonaut-io/argonaut/issues/104#issuecomment-291001334, or mute the thread https://github.com/notifications/unsubscribe-auth/AABwJyntLvpVufcOEZ05MRnMO0f0xLsrks5rr9xIgaJpZM4B8veS .

seanparsons commented 7 years ago

DecodeJson.validate should be able to achieve that (albeit at a cost to the caller). It does chime slightly with an idea I've had in the past for a different way to handle parsing completely but behaviour like this would sit weirdly in our current model for decoding.

kevinmeredith commented 7 years ago

Thanks, @seanparsons, for pointing me to that method!

Locally, I added the following:

case Some(w) => println(a); DecodeResult.ok(w)

to this line.

Then, I re-ran the above example via sbt console:

JSON contains exact # of fields, one, as case class to which it's decoded

scala> Parse.decode[Person]("""{"name": "Bob"}""")
HCursor(CObject(CJson({"name":"Bob"}),false,object[(name,"Bob")],(name,"Bob")),CursorHistory(List(El(CursorOpDownField(name),true))))
res0: Either[Either[String,(String, argonaut.CursorHistory)],Person] = Right(Person(Bob))

JSON has one extra field, i.e. not in the Person

scala> Parse.decode[Person]("""{"name": "Bob", "foo": "dunno"}""")
HCursor(CObject(CJson({"name":"Bob","foo":"dunno"}),false,object[(name,"Bob"),(foo,"dunno")],(name,"Bob")),CursorHistory(List(El(CursorOpDownField(name),true))))
res1: Either[Either[String,(String, argonaut.CursorHistory)],Person] = Right(Person(Bob))

After decoding, i.e. JSON => Person, is it possible to figure out if the *Cursor has any extra, i.e. not walked, fields?

I looked at the scaladocs for HCursor and ACursor, but I'm not sure. Please help me out?

seanparsons commented 7 years ago

@kevinmeredith Not walked? I think you'd be better off just checking that fieldSet matches what you expect.

kevinmeredith commented 7 years ago

matches what you expect

Could you please say more?

Taking your reply, @seanparsons, I posted this StackOverflow question - http://stackoverflow.com/questions/43433054/retrieving-traversed-json-fields.

I appreciate any help, please.

seanparsons commented 7 years ago

@kevinmeredith This is what you want to check against correct?

scala> json.right.get.hcursor.fieldSet
res1: Option[Set[argonaut.Json.JsonField]] = Some(Set(foo))
kevinmeredith commented 7 years ago

@seanparsons Given:

case class Person(name: String)

and:

scala> Parse.parse("""{"name": "Bob", "foo": "dunno"}""").toEither.right.get.hcursor.fieldSet
res6: Option[Set[argonaut.Json.JsonField]] = Some(Set(name, foo))

How would you use fieldSet to determine if there's any extraneous/extra fields in the input JSON? As far as I understand, there's no getFields on a case class, but please correct me if there is.

Also, my original thought was add an def decodeExactly function that did:

  1. Get list of fields not walked/traversed - http://stackoverflow.com/questions/43433054/retrieving-traversed-json-fields
  2. Decode JSON to case class
  3. if step 2 succeeds, check if 1 is empty. If it's non-empty, then return an error that there's extra fields

Thanks for your continued help.

seanparsons commented 7 years ago

Well surely the fieldSet you want is Some(Set("name")) right?

Unless I'm mistaken that means in the DecodeJson.validate call the function HCursor => Boolean would look something like _.fieldSet == Some(Set("name")).

To support a decodeExactly we'd need to do a non-trivial amount of changes because of the way the existing code is written.

kevinmeredith commented 7 years ago

Given all of the fields of a case class A and all of the Json's keys, assuming we decode the Json to an A, we can compare A's fields to Json's keys to implement this decodeExactly behavior, no?

https://stackoverflow.com/a/27445097/409976 shows how, using shapeless, we can get all of a case classes' fields w/ LabelledGeneric.

With respect to the second problem, i.e. getting all keys from a Json, I'm not sure if such a method exists. I don't believe that Cursor#fields would work, as is:

scala> Json.obj(
     "a" -> jString(""), 
     "b" -> jBool(false), 
     "c" -> Json.obj("a" -> jNull) 
).cursor.fields
res27: Option[List[argonaut.Json.JsonField]] = Some(List(a, b, c)) // where's `c`'s a?

Does a allFields method, which would find all fields from the Json recursively, exist? If not, perhaps a PR could implement that method, and then, using LabelledGeneric and the aforementioned method, we could address this issue?

Finally, if you all find this approach to be technically sound, then perhaps it belongs in https://github.com/alexarchambault/argonaut-shapeless?

seanparsons commented 7 years ago

@kevinmeredith The fields and fieldsSet methods gives you the fields of the object currently in focus. So if you want to check that it has only fields a, b and c you can validate against that.

Hence this:

scala> val testDecoder: DecodeJson[(String, Boolean)] = jdecode2L((a: String, b: Boolean) => (a, b))("a", "b")
testDecoder: argonaut.DecodeJson[(String, Boolean)] = argonaut.DecodeJson$$anon$4@3f59fa4e

scala> val improvedTestDecoder = testDecoder.validate(_.fieldSet == Some(Set("a", "b")), "Wrong fields")
improvedTestDecoder: argonaut.DecodeJson[(String, Boolean)] = argonaut.DecodeJson$$anon$4@45ed447c

scala> improvedTestDecoder.decodeJson(Json("a" := "q", "b" := true))
res7: argonaut.DecodeResult[(String, Boolean)] = DecodeResult(Right((q,true)))

scala> improvedTestDecoder.decodeJson(Json("a" := "q", "b" := true, "c" := 1))
res8: argonaut.DecodeResult[(String, Boolean)] = DecodeResult(Left((Wrong fields,CursorHistory(List()))))
kevinmeredith commented 7 years ago

As I understand your reply, @seanparsons, you believe the LabelledGeneric approach that I proposed in https://github.com/argonaut-io/argonaut/issues/104#issuecomment-313884079 to be unnecessary given the solution that you've provided with DecodeJson#validate?

If so, perhaps I should ask if the application of LabelledGeneric would be appropriate to https://github.com/alexarchambault/argonaut-shapeless, i.e. to derive such a DecodeJson without the need to call DecodeJson#validate manually?

seanparsons commented 7 years ago

@kevinmeredith I was pointing out the manual solution, a generic solution would likely want to invoke the same kind of code. It would definitely be a nice thing to have in argonaut-shapeless but that's not really a question for me! :)

alexarchambault commented 7 years ago

That's definitely do-able with argonaut-shapeless. It has some JsonProductCodec, that can be customized to check for any non-decoded field. One should just roughly adapt back these lines to argonaut-shapeless, and ensure the implicit they define is in scope when the DecodeJson are derived. Then decoding would fail if any extra field is present.