Closed mamazu closed 1 year ago
The pipeline is failing for code I didn't touch. Should I fix that as well?
if you have some time to spare to fix the phpstan and cs fixer complaints i would of course appreciate that ;-)
@dbu, @mamazu, @chirimoya I'm doing some tests about improving deleting properties. As in the previous performance improvements the best thing is completely avoid \DOMDocument
. I did a quick benchmark and the best solution as always when working with PHP and XML go the workaround over xml_parse
.
The Benchmark results and also different implementation of DOMDocument
are available (2.0 vs 1.9) and some little improvements on query.
https://github.com/alexander-schranz/jackalope-xml-delete-properties
impressive! we do have quite some test coverage, though edge cases of strange node names and such is probably not extensively covered, if such things become tricky with xml_parse.
should we get this in for 2.0 ?
it sounds like a big change, so i am a bit wary of doing it as 2.1. we could also support both and have an option to enable the xml_parse mode but i would prefer if we can be confident with the new mode and not have to support both - the code is complex enough as it is :-)
I personally even would do it for 1.x. We will sure do some more testing, but currently I did implement the deletion on real world data homepage with about 40 different languages (including chinese) and 70000 properties so a very complex phpcr node and the resulted xml is 1to1 the same as the current implementation. What I did not yet test is special chars in props names, but if I remember correctly CR is there already very strict and only allow specific chars as property names.
We did already similar things when patching reading the data in 1.7.0: https://github.com/jackalope/jackalope-doctrine-dbal/commit/68392374fc4c4ea5d2380baf486e11a8ca700caa
I found another improvement in the
deleteProperties
function. In the old code it would run a node id query for every path you want to delete. Now, it groups the properties by node and then queries the database once for an id.