conan-io / conan

Conan - The open-source C and C++ package manager
https://conan.io
MIT License

[feature] Packageable pip installable Python venv based utilities #8626

Open thorntonryan opened 3 years ago

thorntonryan commented 3 years ago

Description

What if Conan had the capability to capture/package pip installable python utilities (e.g. Sphinx) or thunked scripts and turn them into a package a project could build_requires for use during builds?

My coworker @puetzk has developed a couple of nifty little utility packages that allow us to do just that.

Would there be any interest in adding first class support for such a capability in Conan? Maybe under conan.tools.* ?

It's bending Conan's packaging model a little, but the capability is powerful. Feel free to close the issue if it's not in alignment with your project goals -- totally understand.

But we're doing something cool, if not a little outside the box, and I figured it'd be worth floating the concept to see what you think.

As always, thanks for all your hard work! Conan is now an integral part of our workflow. And everyone does an amazing job delivering such a high quality tool! Keep up the great work!

Motivation

One of the unforeseen benefits of moving to Conan has been that Python is now a mandatory, first class prerequisite for most of our C++ projects. This actually gives us greater flexibility in developing build and other utility scripts in Python (instead of bash or bat) for use throughout our build process.

But sometimes utility scripts depend on other pip packages, etc. How do we share utility scripts and make sure everyone has the right packages installed?

Our solution was to leverage Python virtual environments and package them with Conan.

We started by developing a Python venv generator, that would install any pip installable script as a python venv inside your build folder using the same version of Python Conan was using -- that way we could ensure you had a Python environment capable of running our utility script.
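A minimal sketch of that idea, using only the standard library. The helper names (`venv_bin_dir`, `pip_install_command`, `create_tool_venv`) are hypothetical and are not taken from our generator; the point is only that `venv.create` builds the environment from the interpreter that is currently running, which is the Python running Conan when this is called from a recipe or generator.

```python
import os
import subprocess
import venv
from pathlib import Path


def venv_bin_dir(env_dir):
    """Directory holding the venv's executables (Scripts on Windows, bin elsewhere)."""
    return Path(env_dir) / ("Scripts" if os.name == "nt" else "bin")


def pip_install_command(env_dir, requirements):
    """Build the command that installs `requirements` with the venv's own interpreter."""
    python = venv_bin_dir(env_dir) / ("python.exe" if os.name == "nt" else "python")
    return [str(python), "-m", "pip", "install", *requirements]


def create_tool_venv(env_dir, requirements):
    """Create a venv from the running interpreter, then pip-install into it."""
    # venv.create bases the environment on the interpreter executing this code.
    venv.create(env_dir, with_pip=True)
    # Needs network access (or a configured package index) to succeed.
    subprocess.check_call(pip_install_command(env_dir, requirements))
```

Calling `create_tool_venv("build/python-sphinx", ["sphinx==3.2.1"])` would leave `sphinx-build` in the directory returned by `venv_bin_dir`.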

But this was a little inefficient. It meant that if you had multiple projects depending on the same "python-*" venv utility, you'd install the same utility into multiple build folders. Totally works, but is a little bit heavy.

Is there a way to better leverage Conan to avoid re-creating these packages?

The bit of a "hack" we settled on was giving these Python venv based utility packages build_policy="missing". That way the package is created once and lives in your Conan cache. Package binaries are never uploaded / shared, but are built once, locally, on a given developer's machine and can then be re-used across projects.
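A rough sketch of what such a recipe could look like. This is not our actual recipe: the class name and the body of `package()` are assumptions for illustration, and the `ImportError` fallback exists only so the snippet can be read and imported where Conan is not installed. The one attribute this is meant to show is `build_policy = "missing"`.

```python
import venv

try:
    from conan import ConanFile  # available in recent Conan 1.x and in 2.x
except ImportError:
    ConanFile = object  # fallback so the sketch imports without Conan


class PythonSphinxConan(ConanFile):
    name = "python-sphinx"
    version = "3.2.1"
    # Build locally whenever no binary is found in the cache.
    build_policy = "missing"

    def package(self):
        # Hypothetical body: the package folder itself becomes the venv.
        venv.create(self.package_folder, with_pip=True)
        # Installing the tool with the venv's pip would follow here.
```

With this policy a consumer never needs `--build=missing` for the package; the first project to require it builds it, and later projects on the same machine reuse the cached copy.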

But for us, the benefits far outweighed the cons.

We've developed python packages for Sphinx, for processing a custom InnoSetup style template (git ignore style pattern matching for matching Conan dependencies and creating InnoSetup install rules), and for generating WinSparkle style appcast snippets (including DSA signature checks over our binaries, etc.). If you can write it in Python, you can get at it from Conan.

And again, each of these utility scripts likely depends on a variety of Python packages/libraries that need pip installed, but these venv packages handle all the dependency mess in a convenient way.

Example

For example, my project uses Sphinx to build our documentation.

In my conanfile.py, I have the following:

    build_requires = (
        ...
        "python-sphinx/3.2.1@user/channel",
        ...
    )

And our package has a Modules/python-sphinx-config.cmake, which helps CMake find sphinx-build relative to its package_folder.

    cmake_minimum_required(VERSION 3.12)

    if(NOT TARGET python3::sphinx-build)
        add_executable(python3::sphinx-build IMPORTED)
        set_target_properties(python3::sphinx-build PROPERTIES
            IMPORTED_LOCATION "${CMAKE_CURRENT_LIST_DIR}/../bin/sphinx-build.exe"
        )
    endif()

    if(NOT TARGET python3::sphinx-apidoc)
        add_executable(python3::sphinx-apidoc IMPORTED)
        set_target_properties(python3::sphinx-apidoc PROPERTIES
            IMPORTED_LOCATION "${CMAKE_CURRENT_LIST_DIR}/../bin/sphinx-apidoc.exe"
        )
    endif()

    if(NOT TARGET python3::sphinx-autogen)
        add_executable(python3::sphinx-autogen IMPORTED)
        set_target_properties(python3::sphinx-autogen PROPERTIES
            IMPORTED_LOCATION "${CMAKE_CURRENT_LIST_DIR}/../bin/sphinx-autogen.exe"
        )
    endif()

    if(NOT TARGET python3::sphinx-quickstart)
        add_executable(python3::sphinx-quickstart IMPORTED)
        set_target_properties(python3::sphinx-quickstart PROPERTIES
            IMPORTED_LOCATION "${CMAKE_CURRENT_LIST_DIR}/../bin/sphinx-quickstart.exe"
        )
    endif()
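The four stanzas above differ only in the tool name, so a file like this could be generated rather than written by hand. The following is a hypothetical sketch, not how our package produces the file: `render_config` and its `suffix` parameter are made-up names, and the `.exe` default simply mirrors the Windows-only paths shown above.

```python
STANZA = """\
if(NOT TARGET python3::{name})
    add_executable(python3::{name} IMPORTED)
    set_target_properties(python3::{name} PROPERTIES
        IMPORTED_LOCATION "${{CMAKE_CURRENT_LIST_DIR}}/../bin/{name}{suffix}"
    )
endif()
"""


def render_config(names, suffix=".exe"):
    """Return the text of a CMake config file with one imported target per script."""
    header = "cmake_minimum_required(VERSION 3.12)\n\n"
    # One stanza per entry point, separated by a blank line.
    return header + "\n".join(STANZA.format(name=n, suffix=suffix) for n in names)


if __name__ == "__main__":
    print(render_config(["sphinx-build", "sphinx-apidoc",
                         "sphinx-autogen", "sphinx-quickstart"]))
```

Passing `suffix=""` would give the extensionless names a venv uses on non-Windows platforms, although the `bin` directory in the path would still need to match the layout of the package.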

So that way from my CMakeLists.txt, I can reference the target python3::sphinx-build in custom commands, etc. to build my documentation.

    add_custom_command(
        ...
        COMMAND python3::sphinx-build # ${python3::sphinx-build}
        ${_args_SOURCEDIR}
        ...
    )

No need to install sphinx manually. I just add the conan package reference. And everything "magically" works.

memsharded commented 3 years ago

Hi @thorntonryan

This looks interesting. The problem of reusing Python code in Conan has come up a few times: while python_requires is good for reusing small snippets in recipes, there is still demand for managing heavier Python tools (installable with pip), and the very nature of python/pip installation makes that hard to do in a straightforward way in Conan.

So yes, I would love to know more. It would be good to have something to start learning from: a proof-of-concept project, a branch, or a pull request, even if only intended for testing and not yet for merging; whatever you find most convenient to start with will be good. It might turn out to be complicated to deliver as a built-in feature if we run into difficulties, but it definitely seems worth trying, so let's do it. Thanks very much for the offer!

thorntonryan commented 3 years ago

OK. Will do! Glad to hear there's some interest.

We'll need to do some work on our end first, which may take some time. But I didn't want to start that process if there was no interest. I'll touch base with @puetzk tomorrow and try and come up with a course of action. That way you can see what we're doing with a working example, etc.

I'll report back here once I know more.

thorntonryan commented 3 years ago

Ok, nothing major to report. But started the approval process internally. Sorry for the long delay.

samuel-emrys commented 2 years ago

@thorntonryan how's this going? We're just approaching this problem now and it looks like any solution you may have here would be very useful to us.

thorntonryan commented 2 years ago

@samuel-emrys, terribly sorry for the delay.

I've got the internal approvals cleared on my end.

I had a bit of a window and lost it due to other work items taking priority.

We certainly need to revisit this in light of Conan 2.0 to see if/how it will be impacted. So this is a good reminder to revisit this subject and at least push what we have describing how to use it so that others can determine whether there's anything worth cribbing/sharing.

Thanks for your patience!

thorntonryan commented 2 years ago

@samuel-emrys and @memsharded, I've got an example repo set up here to illustrate our approach to having conan packageable pip venv based utilities: https://github.com/thorntonryan/conan-pyvenv

It requires CMake and Ninja, and I've only tested it with the MSVC compiler.

But I think the project is simple enough to illustrate the point of our approach.

We haven't moved to the new cmake_find_package generator, so forgive the use of the old cmake generator. I'll need to figure out why the new generator doesn't work with the packaged / supporting CMake files.

samuel-emrys commented 2 years ago

I've refactored @thorntonryan's work to be compliant with the way custom generators should be used in conan 2.0, to the best of my understanding. I've got a few exemplar repositories to demonstrate this:

I think that there might be a place for this in the conan library itself to provide some of these utilities more broadly, rather than just a custom generator (which, as far as I can see, aren't currently accepted for distribution via CCI?), so my intent is to re-organise some of this into a draft PR for further review/discussion on better implementations/approaches.

thorntonryan commented 2 years ago

A few other caveats worth mentioning about the approach outlined above.

The python venvs aren't fully reproducible and don't take advantage of Pipfile.lock or anything. We make no claims to offer full reproducibility within the dependency tree of each high level pip requires.

This means two different installs of the same pip installable package may have different dependencies.


Since the generators simply "import" the pyvenv utility into the current conanfile.py: https://github.com/thorntonryan/conan-pyvenv/blob/32d2a6a3820afd5b9405eb5216ff3be82487cbe0/pyvenv/conanfile.py#L6

This doesn't always play nicely with Conan if packages have multiple versions of pyvenv in play. We've certainly run into scenarios where pyvenv/0.3 introduced breaking changes to a function signature that was present in pyvenv/0.2, and depending on which way Conan resolved dependencies, you could have conflicts in which version of pyvenv was actually imported into Conan's namespace, if your requirements happened to have both pyvenv/0.2 and pyvenv/0.3 in play at the same time.

There's probably a way to more carefully control how we import these helpers, but those sorts of things should be taken into careful consideration if there's any desire to further productionize what we've started.
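For anyone unfamiliar with why this bites, here is a standalone illustration of the general Python behaviour involved. It does not use Conan or the real pyvenv helper; the module name `pyvenv_demo` and the function `import_twice` are made up for the demo. Python caches imported modules by name in `sys.modules`, so once one copy of a module is loaded, a later import of the same name returns that copy even if a different version now comes first on `sys.path`.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_twice(module_name="pyvenv_demo"):
    """Import a same-named module from two directories; report what each import sees."""
    with tempfile.TemporaryDirectory() as tmp:
        dirs = []
        for version in ("0.2", "0.3"):
            d = Path(tmp) / version
            d.mkdir()
            (d / f"{module_name}.py").write_text(f'VERSION = "{version}"\n')
            dirs.append(str(d))
        seen = []
        try:
            for d in dirs:
                sys.path.insert(0, d)  # the newest directory takes priority
                seen.append(importlib.import_module(module_name).VERSION)
        finally:
            # Undo the changes to interpreter state made for the demo.
            for d in dirs:
                if d in sys.path:
                    sys.path.remove(d)
            sys.modules.pop(module_name, None)
        return seen


if __name__ == "__main__":
    print(import_twice())  # ['0.2', '0.2']: the second import hits the cache
```

Whichever version is imported first wins for the rest of the process, which matches the "depends on which way Conan resolved dependencies" symptom described above.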


And also, obviously, the venv packages are worthless if you happen to uninstall the underlying Python they were based on. I don't think there's anything that catches this until you try and use a package that points to a place on disk that no longer exists, which means the package needs to be rebuilt, etc.
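One way such a check could look, as a sketch only (we don't do this today). It relies on the `pyvenv.cfg` file that the standard `venv` module writes at the root of every environment, whose `home` key records the directory of the base interpreter. The function names are hypothetical.

```python
from pathlib import Path


def read_pyvenv_cfg(env_dir):
    """Parse the simple `key = value` lines of a venv's pyvenv.cfg into a dict."""
    cfg = {}
    text = (Path(env_dir) / "pyvenv.cfg").read_text(encoding="utf-8")
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            cfg[key.strip()] = value.strip()
    return cfg


def base_python_present(env_dir):
    """Return True if the directory recorded under `home` still exists."""
    home = read_pyvenv_cfg(env_dir).get("home")
    return home is not None and Path(home).is_dir()
```

A recipe or helper could call `base_python_present(package_folder)` before using the tools and report that the package needs rebuilding when it returns `False`, instead of failing later with a missing-file error.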

thorntonryan commented 2 years ago

FWIW, @puetzk was the primary developer, and he might offer more known pitfalls / nits, but those are the caveats I think we're aware of (and choose to live with).

samuel-emrys commented 2 years ago

The python venvs aren't fully reproducible and don't take advantage of Pipfile.lock or anything. We make no claims to offer full reproducibility within the dependency tree of each high level pip requires.

This means two different installs of the same pip installable package may have different dependencies.

I'm not sure that this should be the concern of python facilities at this level of abstraction within conan. In order to deal with that, we would need to essentially write a recipe for each individual python package, or around something like pipenv.

Even so, it is still possible to ensure reproducibility with tools like pip-tools. Specifically, pip-compile can be used to resolve the dependency tree for a list of high level dependencies specified in a requirements.in file, and generate a requirements.txt file. The output of a tool like this is sorted alphabetically, so when used with the python-virtualenv package I developed an exemplar for above, you can do something like this:

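    import json
    import pathlib
    import pkg_resources  # ships with setuptools

    def requirements(self):  # method of the consuming ConanFile
        self.requires("python-virtualenv/system")
        # Pass the pinned requirements to the venv package as a JSON list.
        with pathlib.Path("requirements.txt").open() as requirements_txt:
            self.options["python-virtualenv"].requirements = json.dumps([
                str(requirement) for requirement in pkg_resources.parse_requirements(requirements_txt)
            ])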

Because all the versions are pinned (even hashed if you want OS independence), and it's alphabetical, you can have confidence that the json string that's being generated here will only vary if there's a meaningful change to requirements.txt, and get a reasonably reproducible environment. It's a soft guarantee and certainly well within the domain of good user behaviour, though.
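To make the stability argument concrete, here is a small standalone sketch that does not depend on pkg_resources. The function `requirements_option` is a hypothetical name, and the parser is deliberately simplified: it assumes one pinned requirement per line, treats everything after `#` as a comment, and skips option lines such as `--hash`, so it is not a full requirements.txt parser. It sorts explicitly rather than relying on the order pip-compile writes.

```python
import json


def requirements_option(text):
    """Turn requirements.txt content into a canonical JSON string."""
    pins = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line and not line.startswith("-"):  # skip blanks and pip options
            pins.append(line)
    # Sorting makes the result independent of the order of lines in the file.
    return json.dumps(sorted(pins))


if __name__ == "__main__":
    print(requirements_option("sphinx==3.2.1\njinja2==2.11.3\n"))
```

Two files that pin the same versions produce the same string regardless of comments, blank lines, or ordering, so the option value changes only when a pin changes.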

Since the generators simply "import" the pyvenv utility into the current conanfile.py: https://github.com/thorntonryan/conan-pyvenv/blob/32d2a6a3820afd5b9405eb5216ff3be82487cbe0/pyvenv/conanfile.py#L6

This doesn't always play nicely with Conan if packages have multiple versions of pyvenv in play.

I think that this is probably resolved by part of the refactor I did, which was to utilise python_requires to share the key functional classes around, and would also be resolved if these facilities were merged into the conan codebase.

And also, obviously, the venv packages are worthless if you happen to uninstall the underlying Python they were based on. I don't think there's anything that catches this until you try and use a package that points to a place on disk that no longer exists, which means the package needs to be rebuilt, etc.

Yeah, I think this is just a danger associated with upgrading python and using virtualenvs generally. Again, I think that this should probably be outside of the consideration of conan. That is, and I haven't looked into this, unless there's a way of building a python interpreter using conan - is that something that the cpython recipe provides? That would help us gain some separation from any system installation.

thorntonryan commented 2 years ago

I think that this is probably resolved by part of the refactor I did

Agreed. Actually, I think just moving it into conan itself likely resolves the issue. That way there's just one source of truth for the pyvenv implementation -- not multiple sources depending on which package you pulled in.

Good work!