Closed schuylermartin45 closed 1 year ago
This is not the world's best patch, but I want to reduce the blast-radius of these changes for now.
Previously, providing the following valid JSON patch to a recipe:
{ "op": "replace", "path": "@output/build/number", "value": "0", }
would result in
Recipe.patch()
wiping out the entire block ofbuild
sub-keys.I'm not entirely sure why that was happening, but I believe it had to do with defaulting the
match
variable to.*
which caused the current algorithm to find every line under build and wipe it completely:build: 0 0
Tested changes against
perseverance-skills
.After having run into many issues using
percy
, IMO we need to:
- Make
patch
operations "atomic" (i.e..patch()
should not render and save changes. To put another way, patching should maintain it's in-memory changes and the caller should be responsible for writing to disk. This is to save disk I/O operations when we start making lots of changes AND allows a program to ask a user if they would like the changes that have been made.In general, make
percy
RFC 6902 compliant as much as we possibly can.
- This will reduce confusion from anyone making calls to
percy
- Custom concepts like
@object
are not well documented and do not seem entirely necessary.- Add significant testing automation (primarily in the form of unit testing) to ensure reliability of the library AND to prevent regressions.
- Reduce/eliminate the use of
deepcopy()
for performance and code clarity. Especially as it seems to be used heavily for logging purposes.- Reduce/eliminate the injection of custom information into the user-provided patch object in favor of using more variables. This correlates heavily with the previous point.
I can gladly raise these as GitHub issues, but I wanted them to be written down in some form. I do not likely have the time to implement all of these ideas before I get a first version of
recipe-bootstrapper
out, but I am concerned about the long-term stability ofrecipe-bootstrapper
andpercy
if these points are not addressed.
JC stumbled upon the same issue when working on another topic. This PR should address it: https://github.com/anaconda-distribution/percy/pull/40
To address some of your comments:
What I realize is that to use this patch implementation efficiently, one still need to be a regex master. At the match regex influence a lot the results.
Some additional information:
Example to append patches:
[
{
"op": "add",
"path": "source/patches",
"match": "(aa.patch)|(bb.patch)",
"value": ["aa.patch", "bb.patch"]
}
]
Examples to replace cython
or cython 0.29.23
by cython 0.29
in any package of the recipe:
[
{
"op": "replace",
"path": "@output/requirements/host",
"match": "cython\\s+0.29.\\d+",
"value": ["cython 0.29"]
},
{
"op": "replace",
"path": "@output/requirements/host",
"match": "cython\\s*($|[^#])",
"value": ["cython 0.29"]
}
]
Example to update a build number:
[{"op": "replace", "path": "build/number", "value": "0"}, {"op": "replace", "path": "@output/build/number", "value": "0"}]
@cbouss I got pretty far with the parse-tree implementation late last week. For a day and a half of dev time, I think it's very promising. I can already go from YAML -> tree -> YAML losslessly on a simple example.
The work is here and I plan to continue to work through test cases this week: https://github.com/anaconda-distribution/percy/tree/schuy_recipe_parser
I'm going to start with the replace
patch operation first as it addresses what I need in recipe-bootstrapper
. I think I can match the RFC pretty closely and potentially expand upon it for our needs.
If you would like to discuss further about this, we can hop on a call, maybe with JC and Mark as well, but I think the parse tree method is going to be a very powerful tool. I think you saw my stats from last week, but I think we can cover about 90% of recipes with it and maybe fall back to a regex mechanism for the other 10%. And that last 10% is not likely to be automated anytime soon.
https://github.com/anaconda-distribution/percy/tree/schuy_recipe_parser
That's cool.
So what would be needed eventually is:
If you haven't looked at it, ruamel.yaml.jinja2 does handle some of the preserving part of jinja. But it doesn't allow updating jinja itself. ruamel parser also has editing capabilities on comments (useful since selectors are comments)
A meeting with a few people would be useful. Perhaps we should define some use cases.
- Ability to update yaml values
- Ability to update jinja variables
- Ability to update selectors variables
I can confidently handle those first two points with that branch/already have plans to handle those cases. The third one I have not addressed and the parse tree is storing those as "comments" per node in the tree. I don't need selectors for my current project needs, but I don't see an issue with parsing the "comment" into a more semantic form.
When I get a chance, I'll find time on the calendar to get a sync meeting between you, me, JC, and Mark.
Ah, I mostly needed 1 and 3 so far :)
To keep all links in one place, this is the conda recipe proposal change to address issues regarding manipulating recipes: https://github.com/conda-incubator/ceps/pull/54
- Ability to update yaml values
- Ability to update jinja variables
- Ability to update selectors variables
I would add to "Ability to update selectors variables" that we should be able to add, remove and update selectors without involving any kind of environment evaluation. For example, when I ask to add a selector, I want it to be applied no matter what variants are currently available.
I'm going to close this out, given the parse tree work.
This is not the world's best patch, but I want to reduce the blast-radius of these changes for now.
Previously, providing the following valid JSON patch to a recipe:
would result in
Recipe.patch()
wiping out the entire block ofbuild
sub-keys.I'm not entirely sure why that was happening, but I believe it had to do with defaulting the
match
variable to.*
which caused the current algorithm to find every line under build and wipe it completely:Tested changes against
perseverance-skills
.After having run into many issues using
percy
, IMO we need to:patch
operations "atomic" (i.e..patch()
should not render and save changes. To put another way, patching should maintain it's in-memory changes and the caller should be responsible for writing to disk. This is to save disk I/O operations when we start making lots of changes AND allows a program to ask a user if they would like the changes that have been made.percy
RFC 6902 compliant as much as we possibly can.percy
@object
are not well documented and do not seem entirely necessary.deepcopy()
for performance and code clarity. Especially as it seems to be used heavily for logging purposes.I can gladly raise these as GitHub issues, but I wanted them to be written down in some form. I do not likely have the time to implement all of these ideas before I get a first version of
recipe-bootstrapper
out, but I am concerned about the long-term stability ofrecipe-bootstrapper
andpercy
if these points are not addressed.