jsonrainbow / json-schema

PHP implementation of JSON schema. Fork of the http://jsonschemaphpv.sourceforge.net/ project

Issues with Zoopla's schema #499

Closed jolleychris closed 7 months ago

jolleychris commented 6 years ago

Hi,

If I validate against Zoopla's list/update schema here, I always get errors on one field (description->0->text), whereas if I submit to them, it goes through OK. Very frustrating, as an error in submission restarts their error count, and it takes 7 days error-free in order to go live with them! I'd rather be validating before submitting to avoid this.

That aside, I tried putting the schema in as a local file, but now with the same data, every property is invalid! Any ideas why that'd be?

Their schema: https://realtime-listings.webservices.zpg.co.uk/docs/latest/schemas/listing/update.json

                // Validate
                $validator = new \JsonSchema\Validator;
                $validator->validate($objRequest->property, (object)['$ref' => 'file:///var/www/..some path../schema/update.json']);

Any help would be appreciated! Here's some info from Zoopla themselves about their schema, regex and PHP. I don't quite understand how I can work around it. Ideas welcome there too.

"The PHP library issues have only come to light recently, and due to our Perl backend system, which does not (at present) use ECMA 262 regular expressions, we've ended up with a situation where, due to the JSON spec being quite lenient in this respect:

ok: zpg regex -> JSON spec
ok: php ECMA 262 parsers -> JSON spec
not ok: zpg regex <-> php ECMA 262 parsers

from an earlier message:

If you are using PHP, do not validate on your side unless you want to extend the validator's regex handling. Please note that I have taken a look at the recommended list on json-schema.org. All of them seem to make use of PHP's internal regex syntax ('preg_match'). Since PHP regex strings need delimiters (/myregex/ or variants thereof) coupled with the JSON spec not explicitly demanding this (as most implementations would use JavaScript(ish) regex ECMA 262 which does make use of delimiters)"

erayd commented 6 years ago

Could you please post a more specific example of the problem? What you've provided is their entire schema, no data, and no error message content. Additionally, their schema does not define any property named "description" - to which property are you referring?

If possible, could you please provide a minimal piece of sample data which triggers an error against that schema, the entire error, and which version of this library you are using for validation. This will allow me to track down the cause of your issue and determine whether there is a bug.

Regarding the regex thing

The fundamental issue is that PHP does not know how to handle ECMA-262 regular expressions (PHP uses PCRE). Rudimentary conversion logic is possible, which is what allows it to work at all - but full translation between ECMA-262 and PCRE is an extremely challenging task, and one that so far I have not seen properly solved anywhere.

Most applications and libraries that have such a requirement generally just force the big-picture syntax to conform (e.g. by fixing delimiter requirements, modifiers etc.), and then run the expression directly without any further changes. Most of the time, the expression is such that it will successfully run in both an ECMA-262 engine and a PCRE engine - and because this is enough for most cases, there's not much incentive to spend an inordinate amount of development time to solve it properly, when a 'proper' solution is only needed occasionally, and can usually be worked around.
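To make the "force the big-picture syntax to conform" approach concrete, here is a minimal sketch. It assumes a bare pattern string taken from a schema and simply wraps it in delimiters before handing it to PCRE. The helper names `toPcre` and `matchesSchemaPattern` are hypothetical, and this is not necessarily how this library implements it.

```php
<?php
/**
 * Hypothetical helper: wrap a bare schema pattern in PCRE delimiters.
 * Known limitation of this rough approach: a pattern that already
 * contains "\~" would end up escaped incorrectly.
 */
function toPcre(string $schemaPattern): string
{
    // Escape literal "~" so it cannot terminate the expression early,
    // then add the delimiters and the "u" (UTF-8) modifier.
    return '~' . str_replace('~', '\\~', $schemaPattern) . '~u';
}

/**
 * Returns true or false for match / no match, or null when PCRE
 * could not compile or run the expression at all.
 */
function matchesSchemaPattern(string $schemaPattern, string $value): ?bool
{
    $result = @preg_match(toPcre($schemaPattern), $value);

    return $result === false ? null : $result === 1;
}

var_dump(matchesSchemaPattern('^[a-z]+~[0-9]+$', 'abc~123')); // bool(true)
var_dump(matchesSchemaPattern('^\d{3}$', '12a'));             // bool(false)
```

No translation of the expression body happens here, so any construct that ECMA-262 and PCRE interpret differently is passed through unchanged. That is the gap described above.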

Unfortunately, based on your post, it looks like they're saying that the expressions used by zpg regex are not able to be properly executed as PCRE, even when roughly translated (i.e. they are one of the rare edge cases). They also seem to be aware that this is an extremely hard problem to solve, so they are recommending as a workaround that you avoid client-side validation that relies on regular expressions being correctly interpreted.

jolleychris commented 6 years ago

Apologies, I managed to miss the word 'detailed_' off the detailed_description property. I thought it would still be clear from the schema though. My bad.

Thanks for taking the time to write a great reply, and sorry for the lack of detail on my part. I can't really share the object that we are validating, but here is the resulting pile of error messages from the above code when we try with a local copy of the schema file that I linked to. This happens with pretty much any content we try, by the way.

JSON does not validate. Violations:
[detailed_description[0].text] Does not match the regex pattern ^\S(|(.|\n)*\S)\Z
[branch_name] The property branch_name is required
[] The property category is not defined and the definition does not allow additional properties
[] The property display_address is not defined and the definition does not allow additional properties
[] The property listing_reference is not defined and the definition does not allow additional properties
[] The property detailed_description is not defined and the definition does not allow additional properties
[] The property life_cycle_status is not defined and the definition does not allow additional properties
[] The property property_type is not defined and the definition does not allow additional properties
[] The property pricing is not defined and the definition does not allow additional properties
[] The property administration_fees is not defined and the definition does not allow additional properties
[] The property deposit is not defined and the definition does not allow additional properties
[] The property summary_description is not defined and the definition does not allow additional properties
[] The property total_bedrooms is not defined and the definition does not allow additional properties
[] The property bathrooms is not defined and the definition does not allow additional properties
[] The property living_rooms is not defined and the definition does not allow additional properties
[] The property available_from_date is not defined and the definition does not allow additional properties
[] The property pets_allowed is not defined and the definition does not allow additional properties
[] The property tenant_eligibility is not defined and the definition does not allow additional properties
[] The property content is not defined and the definition does not allow additional properties

jolleychris commented 6 years ago

Aside from the above, when we do not use a local copy of the schema file, but instead use a reference to it with a http:// address, we see the following error only:

JSON does not validate. Violations: [detailed_description[0].text] Does not match the regex pattern ^\S(|(.|\n)*\S)\Z
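One thing that may be worth ruling out when debugging a pattern failure like this: `preg_match` returns `false` (rather than `0`) when the engine itself fails, for example on invalid UTF-8 input under the `u` modifier, and a caller that only checks for a truthy result cannot tell that apart from a genuine non-match. The sketch below runs the pattern from the error message directly and reports which of the three outcomes occurred. The `~` delimiters, the `u` modifier and the helper name `checkPattern` are assumptions for illustration, not a statement about what this library does or about the cause of this issue.

```php
<?php
// The pattern from the error message above, wrapped in "~" delimiters.
// Single quotes keep the backslashes intact for PCRE.
$pattern = '~^\S(|(.|\n)*\S)\Z~u';

/**
 * Hypothetical helper: distinguish "match", "no match" and an engine error.
 */
function checkPattern(string $pcre, string $value): string
{
    $result = @preg_match($pcre, $value);

    if ($result === false) {
        // preg_last_error() returns one of the PREG_*_ERROR constants,
        // e.g. PREG_BAD_UTF8_ERROR or PREG_BACKTRACK_LIMIT_ERROR.
        return 'engine error: ' . preg_last_error();
    }

    return $result === 1 ? 'match' : 'no match';
}

echo checkPattern($pattern, 'A short description.'), "\n"; // match
echo checkPattern($pattern, ' leading space'), "\n";       // no match
echo checkPattern($pattern, "\xC3\x28"), "\n";             // engine error (invalid UTF-8)
```

Running the real `detailed_description[0].text` value through a check like this would show whether the text truly fails the expression or whether PCRE is giving up before it can decide.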

erayd commented 6 years ago

"I cant really share the object that we are validating"

Then could you please provide a dummy object that triggers the problem? The data doesn't need to be real data, it just needs to be something that I can look at to see what's going on.

jolleychris commented 6 years ago

If you don't mind sharing an email address with me, I could send you a real object example. It would be absolutely amazing if you could find a way around this, or a means to improve our current situation, so I'd be more than happy to send it privately.

erayd commented 6 years ago

You're welcome to email me an example object - my email address is in the commit history for this project.

DannyvdSluijs commented 9 months ago

@jolleychris in an attempt to clean up this repo we are trying to filter the issues and see which ones might be closed. Is it safe to assume this is a rather old issue, which sadly was left unanswered, and can be closed? Feel free to close it yourself with some comments if helpful.