Open jrgleason opened 3 weeks ago
Problem is also true of ChangeValue
Ok more info, it wasnt the amount of files but rather 1 file at 1.6mb. I would still recommend the optional regex param as a way to mitigate it as it tears through the file way faster.
Thanks for the added details here @jrgleason ! Let us know if you're indeed cleared to contribute. Ideally we improve performance of that JsonPathMatcher.matches without any additional arguments, but I'm not sure if that has been profiled and tried before: https://github.com/openrewrite/rewrite/blob/a126fc846e3928467f4d12a7c70fc5df21208027/rewrite-yaml/src/main/java/org/openrewrite/yaml/JsonPathMatcher.java#L84-L103
For context: did your large yaml file perhaps contain anchors? I see those are expanded, so they can also add to the complexity and run duration.
The support to resolve references to YAML anchors was added way back in #2272. I think this will definitely cause performance issues (for multiple reasons). I am thinking we should maybe just remove support for YAML references for now, because I get the impression that YAML references aren't correctly implemented anyway. Alternatively, we could look into fixing the performance issues it is causing.
@jrgleason I've improved the performance of the JsonPathMatcher
in the following commits: https://github.com/openrewrite/rewrite/commit/0f76203bede5f77c0df557f42c7bd5627f9b8c4b, https://github.com/openrewrite/rewrite/commit/f71dc0d429184ba7094715a0cf6dd51e17873e2f, https://github.com/openrewrite/rewrite/commit/ae9083973ca333639ffc4b2231822bbdf7da4e6e, https://github.com/openrewrite/rewrite/commit/0f8e755b55f2c90443e1849e28a0c62c47476d46, https://github.com/openrewrite/rewrite/commit/f1f48c08843bacfa0848bba0298314c4e323652f, and https://github.com/openrewrite/rewrite/commit/f30cd55a8de23ff9823dc1cf2a40343ac4b80edc. These changes should make the matching quite a bit faster. Are you able to check if this improves things for you? There is still more that can be done.
What problem are you trying to solve?
I have a project with a bunch of YAML files. When I try to add a recipe that runs ChangeKey against a complex
oldKeyPath
it causes huge lags (in my case a dry run goes from about 6 mins to over 1 hour). To get around this I started creating a simple precondition that scans the whole file for a string (typically the key or another key that is more unique). I understand this has some pitfalls and also doesn't help is a 3rd party recipe uses it and doesn't upgrade so I was hoping we could come up with a better solution.Describe the solution you'd like
Maybe a parameter to the recipe that allows us to pass a search string as well to run the regex before the matcher. It should be done before the matcher is run here...
https://github.com/openrewrite/rewrite/blob/420f56dd4ab3469db4855a2bf37c443b8e8af9f5/rewrite-yaml/src/main/java/org/openrewrite/yaml/ChangeKey.java#L61
Have you considered any alternatives or workarounds?
My work around, as I said, is to use a precondition to check the file for some text before running the matcher
Additional context
I found this issue through my company and I am working on trying to get approved to contribute. When this happens I should be able to provide an example. Till then I will try to recreate one on my own but in theory this should be true of any project with a lot of yaml files (maybe even just a lot of files).
Are you interested in contributing this feature to OpenRewrite?
Sure but need to finish clearance first