sergiobayona / easy_talk

Ruby gem for defining and generating JSON Schema
MIT License
15 stars 1 forks source link

A few questions about schema validation #22

Open carlost opened 2 months ago

carlost commented 2 months ago

I ran across a few things I was a little unsure about while integrating easy_talk into our app. I was hoping you have the time to answer a few questions I had:

JSON Validation

The readme mentions that "EasyTalk does not yet perform JSON validation". Can you confirm that this is no longer the case?

I think this was resolved when you added json-schema in release 0.1.2: https://github.com/sergiobayona/easy_talk/commit/74164669e3e3d2fcbad5cc97ebdb4edb42f1cea8#diff-db2f18104e8129484ced3b2cbe64479d338d247094016629bc8c2682b5cfd93dR85-R87

I just want to make sure I am understanding this correctly.

Unexpected root level attributes raises ActiveModel::UnknownAttributeError

Using spec/easy_talk/schema_validation_spec.rb as a reference, this passes:

it 'errors on a non-existent property' do expect do user.new(surprise: 'yah-got-me', name: 'Jim', age: 30, height: 5.9, email: { address: 'jim@test.com', verified: false }) end.to raise_error(ActiveModel::UnknownAttributeError, /unknown attribute 'surprise'/) end

This occurs even if the "user" Class has additionalProperties explicitly set to true.

However, this does not occur if the unexpected attribute is nested. For instance:

user.new(name: 'Jim', age: 30, height: 5.9, email: { surprise: 'nested k/v', address: 'jim@test.com', verified: false })

Does not raise an error, and its validity is dependent on additionalProperties.

Is this behavior expected? The behavior feels a little surprising because it isn't implied by the API/docs and goes against the expectation when explicitly setting "additionalProperties true". This feels like it is just an unintended consequence of including ActiveModel::API in EasyTalk::Model, which replaces the default initializer with AR's initializer. ActiveModel::API#initialize freaks out if it is passed an unexpected root level attribute.

validation errors for errors and validation_context when additionalProperties is set to false

When additionalProperties is set to false, it looks like the JSON validation is treating the validation_context and errors attributes added by ActiveModel::Validations as if they are values submitted by JSON.

["Validation context object property at /validation_context is a disallowed additional property", "Errors object property at /errors is a disallowed additional property"]

This feels like a bug. Is this behavior expected?

Is there a way to nest the JSON schema definition for the contents of arrays?

When the JSON schema assigns an object to a property, you can specify the the schema for the nested object inline as:

property :email, :object do property :address, String property :verified, T::Boolean end

Is it possible to do this when the property is an array of objects? Something like:

property :emails, T::Array do property :address, String property :verified, T::Boolean end

I also wanted to just thank you for publishing easy_talk. We were able to finally get AWS Bedrock reliably generating the JSON output we needed because of it!

sergiobayona commented 2 months ago

Thanks for this useful feedback Carlos. Regarding json schema validation, the latest version of the gem (v0.2.2) does validate json schema. It piggybacks on the json-schemer gem and uses the ActiveModel api to report back the errors. Take a look at: https://github.com/sergiobayona/easy_talk/blob/main/spec/easy_talk/schema_validation_spec.rb https://github.com/sergiobayona/easy_talk/blob/main/spec/easy_talk/activemodel_integration_spec.rb

sergiobayona commented 2 months ago

regarding the second part of your message:

"validation errors for errors and validation_context when additionalProperties is set to false"

It looks like you found a bug. I'm creating a bug issue and will look into it soon.

sergiobayona commented 2 months ago

as far as nesting an object as an array, yes it possible. You'd define the nested object as a model. Then reference it as a T::Array[Object] where the Object is the name of the model. Like:

class Email
  include EasyTalk::Model

  define_schema do
    property :address, String
    property :verified, T::Boolean
  end
end

class User
  include EasyTalk::Model

  define_schema do
    property :emails, T::Array[Email]
  end
end

Here is a similary spec https://github.com/sergiobayona/easy_talk/blob/main/spec/easy_talk/examples/user_model_spec.rb#L107-L122 for reference.

carlost commented 2 months ago

@sergiobayona thanks for looking into this. The only other question I had was related to the ActiveModel::UnknownAttributeError raised when passing the model's initializer a hash with an unexpected root level attribute. The difference in how the model treats an unexpected root level attribute by raising an error while an unexpected nested attribute is flagged as a validation error is a little surprising.

Is this behavior expected?