Closed dktapps closed 2 years ago
Digging, seems like this is quite an inconvenient problem to solve. My solution would be to embed the patterns as base64 blobs instead of as bare strings, and then generate code like preg_match(base64_decode('AAAAAAA==='))
instead, so that PHP string escaping doesn't get a chance to screw anything up.
Hi @dktapps
Thanks for the report! I'll have a look into it later.
After some testig I've adopted your idea of using encoded strings rather than the regular expression itself in the generated code. I've also added a test case including a regular expression with backslashes´and forward slashes to test escaping and the en-/decoding.
Additional escaping shouldn't be needed at all if you base64, no? It's only needed in the first place to prevent PHP interpreting it differently as a const string.
Yeah I thought the same way first but it's wrong (compare https://3v4l.org/YLRvS which matches a single backslash).
Also forward slashes must be escaped as they are used as delimiter of the regular expression.
In the sample I gave, the 4 backslashes are intended to match 1 backslash (JSON \\\\
resolves to \\
when decoded, passed to PCRE, which then turns it into literal \
). I'm just wondering if that might break something.
Also forward slashes must be escaped as they are used as delimiter of the regular expression.
preg_quote() might be better for that (specifically intended for purpose), though I don't suppose it matters much.
I just tested dev-master with my schema + data and it works.. I'll never understand this godawful mess of backslashes. Thank you for the fix.
Yeah escaping is always hard, in this case on three layers. The JSON Schema, the PHP code and the PCRE engine. Glad it works now as expected.
I've thought about it again and you are absolutely right.
Four backslashes in the JSON mean two backslashes in the regular expression due to JSON escaping. Two backslashes in the regular expression must match one backslash in the tested string due to PCRE escaping. In the current implementation four backslashes in the JSON match two backslashes in the tested string which is wrong. It's so obscure.
But the delimiter escaping must be kept 😄
Edit: changed in https://github.com/wol-soft/php-json-schema-model-generator/commit/7dbf81a3c59ee87708168633e25760d8d912972b
Describe the bug
This schema causes the following code to be generated:
This pattern is meant to allow backslashes, hence the 4
\\\\
(to match the literal backslash character).You can see that the backslash given to preg_match() is incorrectly escaped: there are only 2 backslashes in the resulting string, which is interpreted by PHP as a single backslash even in a single-quoted string.
Expected behavior The backslash pattern above should be correctly escaped.
Schema plugin-schema.json.txt
Version: 0.21.0