colcon / colcon-ros

Extension for colcon to support ROS packages
http://colcon.readthedocs.io
Apache License 2.0
13 stars 26 forks source link

Defer evaluated conditions? #128

Closed mikepurvis closed 1 year ago

mikepurvis commented 3 years ago

Currently, catkin_pkg conditional dependency evaluation happens pretty much as early as possible, at package-identification time:

https://github.com/colcon/colcon-ros/blob/eb0b932cf7b1f3f5a720804e67f3c1a0f8995ca8/colcon_ros/package_identification/ros.py#L195-L196

In a "normal" colcon usage scenario, it doesn't really matter when these evaluations occur since the resulting PackageDescriptor objects are short-lived and are consumed in the same environment that creates them. However, I've been piggy-backing on colcon's discovery logic to cache these, as an alternative to rosdistro_build_cache, which means that they do become long-lived, and the consuming environment might want to access the dependency information under multiple different environment configurations (eg ROS_VERSION=1 vs ROS_VERSION=2).

I know there are a variety of ways this could be handled on my end, but I wondered if there'd be any enthusiasm for a change which would store the full condition string and result in dependency metadata, and then filter the exclusions later (possibly with a standard excluded metadata or similar)? This would allow the logic from catkin_pkg to still be used, and avoid overly-complicating colcon, but also permit me to handle this conveniently on my side.

cottsay commented 2 years ago

Would pushing the population of package dependencies into a PackageAugmentation extension meet your needs? There seems to be precedent in colcon-core for moving dependency enumeration out of the identification phase already, and I've considered following the same pattern here for similar reasons to the ones you gave.

mikepurvis commented 2 years ago

@cottsay I'm having trouble picturing exactly what your proposal would look like, but definitely it sounds like it would allow the deferral that I'm hoping for, if it would let my distro cacher leverage only the identification part and have something usable as the intermediate object.

Somewhat related, it would be great to also have more access to the "rich" information made available in package format 2, particularly stuff like exec_depend vs build_export_depend and the inclusion of doc_depend. Particularly when building in ultra-strict environments like the Nix sandbox, there are real benefits to properly differentiating the "run" dependencies. If the split you're speaking of allowed the catkin_pkg information to come through fairly unfiltered at the identification stage, that could give some benefits in that area as well— it would just mean the later augmentation stage would then become responsible for the full "dependency selection" job, not just in terms of evaluated conditions, but also stuff like which tags are even considered build or run dependencies in this particular context?

cottsay commented 2 years ago

Here are some links to the python package type that colcon-core is discovering:

Identify: https://github.com/colcon/colcon-core/blob/f6a1d8c7ba4f25fb1bf79d9ceaa4e1b5cb27f0d0/colcon_core/package_identification/python.py#L26 Augment: https://github.com/colcon/colcon-core/blob/f6a1d8c7ba4f25fb1bf79d9ceaa4e1b5cb27f0d0/colcon_core/package_augmentation/python.py#L29

The former is responsible only for populating the name and type on the descriptor, and the latter fills in the dependencies and such. One of the primary reasons for this (as far as I can see) is that a colcon.pkg file will override the identification phase, and including dependency information in the colcon.pkg file might not be necessary. So the corresponding augmentation phase is invoked based on which package type the colcon.pkg specified.

After re-reading your initial request, I'm not sure this is a perfect solution for what you're hoping for. Re-invoking the augmentation phase with different context will merge the new dependencies into each descriptors' existing dependencies, so for this to work right, you'll need to remove all of the dependencies from the descriptors before re-augmenting them. Augmentation also requires that the files still be present, so if you've already removed the temporary location where colcon is discovering the packages, this solution isn't going to fit.

Re: specific dependency types - I'm not sure how much interest doc_depend is from a colcon perspective, but there are definitely better things we could be doing with the export dependencies. Another benefit of the augmentation phase over the identification phase is that colcon now has knowledge of the full set of packages. Rather than adding build_export dependencies as runtime dependencies to a package, we could instead add the dependency directly to all downstream packages which build_depend on it. This delayed pattern is already used for resolving group dependencies, which also require information from other packages to fully resolve.

mikepurvis commented 2 years ago

We have an internal colcon-document extension that's fairly complete at this point and is effectively a port of https://github.com/mikepurvis/catkin_tools_document to the colcon ecosystem. We're considering releasing that anyway, but I imagine doing so could provide a bit more of the motivation for first-class handling of doc_depend?

As to the rest, I may be swimming upriver here— it was really useful in the bootstrapping phase to be able to leverage colcon's package discovery, but I wonder if it might make more sense to do a wholly-separate suite of discovery plugins specific to my needs, rather than trying to bend colcon's to a purpose they really aren't intended for. At the same time, colcon distro caching is really cool, and if we're able to open that up at some point that may help stimulate a more informed discussion about what makes sense as a path forward for it.

paulbovbel commented 1 year ago

I believe we see a similar issue when building bundle workspaces using colcon - repositories like BT (https://github.com/BehaviorTree/BehaviorTree.CPP/blob/master/package.xml#L17) which switch build type based on environment, don't respect the environment set by a package like ros_environment. The consequence is that we have to configure variables like ROS_VERSION before the build even runs.

mikepurvis commented 1 year ago

Oh! I forgot about this thread— we obviously did open up our distro caching tool as part of the ROSCon talk that Ivor and I gave. The PkgDescriptor objects are serialized to JSON and cached in a small sqlite db, keyed to repository information:

https://colcon-distro.readthedocs.io/en/latest/schema.html

This is where we call the colcon discovery/augmentation plugins on local source directories in order to generate the descriptors:

https://github.com/clearpathrobotics/colcon-distro/blob/c21d17c3da469f178b84c98375e728a97055431b/colcon_distro/discovery.py#L17-L31

One other potential point of interest is that I added a RepositoryDescriptor and corresponding augmentation point/metadata to the ecosystem, here:

https://github.com/clearpathrobotics/colcon-distro/blob/main/colcon_distro/repository_descriptor.py

I didn't ever propose this upstream as it seems fairly clearly out of scope for colcon proper, but is obviously of interest to the matter of distro caching since that has to interface directly into the distribution.yaml, where the toplevel granularity is repo rather than package. We obviously leverage this to attach narhash (nix archive) data at both levels:

https://github.com/clearpathrobotics/colcon-nix/blob/65388a366dd7ebfdf14885dd26ce4241b870902c/colcon_nix/augmentation.py#L16-L43

mikepurvis commented 1 year ago

Closing this as we've now pursued a different path for resolving the original issue— rather than deferring the conditions, we generate a per-distro cache entry, where each distro defines its own envvars (so, ROS_VERSION).