conda-forge / conda-smithy

The tool for managing conda-forge feedstocks.
https://conda-forge.org/
BSD 3-Clause "New" or "Revised" License
146 stars 169 forks source link

BUG render variants regardless of filename and don't stomp on zip keys when reordering #1921

Closed beckermr closed 2 months ago

beckermr commented 2 months ago

Checklist

This one is #1918 but without the logging changes.

Closes #1917 Closes https://github.com/conda-forge/conda-forge-pinning-feedstock/issues/4190

Here is some more info from the doc string in the PR:

When conda-build renders a recipe, it returns the metadata for each unique file generated. If a key
we use at the top-level in a multi-output recipe does not explicitly impact one of the recipe outputs
(i.e., an output's recipe does use that key), then conda-build will not return all of the variants
for that key.

This behavior is not what we do in conda-forge (i.e., we want all variants that are not explicitly
skipped even if some of the keys in the variants are not explicitly used in an output).

The most robust way to handle this is to write a custom function that returns metadata for each of
the variants in the full exploded matrix that involve a key used by the recipe anywhere,
except the ones that the recipe skips.

Historically (as far as I can tell by reading the code and I could be wrong!), we have been coding around this through the preserve_top_level_loops logic/code. However, that logic breaks for zipped keys that are unused due to how we have to reorder keys to match input keys.

In the process of trying to fix this last thing, I hit a weird edge case of a single key of a zip of multiple keys being the used at the top level in a recipe with this key not being used by any output. In this case, the key was zipped and but the actual values used were not recorded anywhere because no output used the key. Simply putting that key back is dangerous because it isn't clear what the zip length or ordering should be.

The most robust fix to all of this is to preserve all of the variants upfront even if they don't generate a unique filename. Coding around edge cases is bug prone as we've seen.

I have left the rest of the preserve_top_level_loops logic in place, even though with this PR I think we could dispense with it all together and simplify. This simplification task can be left to a future PR. The preservation logic is robust to these changes anyways and is now (I think) an effective no-op in most cases.

beckermr commented 2 months ago

@h-vetinari @jakirkham @conda-forge/core this one is ready for review!

beckermr commented 2 months ago

I don't know enough about the conda-build API to really review that _conda_build_api_render_for_smithy does the right thing unfortunately...

I did a bunch of manual testing of that function in a notebook. I will add a unit test based on this to the PR to help keep things more robust.

beckermr commented 2 months ago

ok @h-vetinari @conda-forge/core I have added an additional test that illustrates the behavior of the new function I have added and ensures it is working properly. This one is ready for another review!

beckermr commented 2 months ago

If there is no substantive feedback in a week, I am going to merge this one. I don't want to let it sit since the bugs it solves are somewhat serious.

cc @conda-forge/core

jakirkham commented 2 months ago

Thanks Matt! 🙏

Do we have an example feedstock PR of these changes (if that is possible)?

beckermr commented 2 months ago

We tried them on scipy with numpy2 and it worked.

beckermr commented 2 months ago

The bug is intermittent so showing a pr won't look especially different. It's supposed to look normal. :)

h-vetinari commented 2 months ago

Do we have an example feedstock PR of these changes (if that is possible)?

The one way to trigger it quite reliably is to add skip: true # [python_impl == "pypy"] on a pypy-migrated feedstock and then rerender. Some of the config files will then have a wrong python_impl.

In fact, if you rerender the current scipy-feedstock with the latest released smithy, you'll see the change (because the feedstock is already rerendered with this branch)

beckermr commented 2 months ago

The bug depends on the python hash seed and so is intermittent.

h-vetinari commented 2 months ago

The bug depends on the python hash seed and so is intermittent.

For the feedstocks in question, it reproduced for me on every rerender. It was just random which exact CI config got wrongly populated.