michaelplatingsarm commented 2 years ago

The SCM feature by design can only be used with the repo containing the recipe. My team would like something similar but we use multiple repos, some quite large.

#8760 advises:

[...] a hard-coded commit hash for the sources method. The hash can be hard-coded in a few places: 1. the recipe, 2. a text file like conandata.yml and then read from disk and used inside the source() method.

Hard-coding isn't an option since our sources are updated many times each hour. I think we have to roll our own multi-SCM feature, either by (a) writing a script to generate conandata.yml with the revisions baked in, or (b) having the conanfile itself capture the revisions.
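
For illustration, option (a) might look roughly like the sketch below. It is only a sketch under assumptions: the function names (`git_head`, `render_conandata`, `write_conandata`) are hypothetical, each source is assumed to be already cloned into its own folder, and the YAML is written by hand so the script needs nothing beyond the standard library.

```python
import json
import subprocess
from pathlib import Path


def git_head(folder):
    """Return the commit checked out in `folder` via `git rev-parse HEAD`."""
    out = subprocess.check_output(["git", "rev-parse", "HEAD"], cwd=folder)
    return out.decode().strip()


def render_conandata(revisions):
    """Render {folder: revision} as a small YAML document.

    JSON string literals are also valid YAML double-quoted scalars, so
    json.dumps is used for quoting. Quoting keeps an all-digit hash from
    being parsed as an integer.
    """
    if not revisions:
        return "source_revisions: {}\n"
    lines = ["source_revisions:"]
    for folder in sorted(revisions):
        lines.append(f"  {json.dumps(folder)}: {json.dumps(revisions[folder])}")
    return "\n".join(lines) + "\n"


def write_conandata(folders, out_path="conandata.yml", get_revision=git_head):
    """Capture one revision per folder and write them to `out_path`."""
    revisions = {folder: get_revision(folder) for folder in folders}
    Path(out_path).write_text(render_conandata(revisions))
    return revisions
```

Running `write_conandata(["hello"])` next to the recipe before `conan export` would bake the current revision of the `hello` checkout into conandata.yml. The `get_revision` parameter exists only so the capture step can be swapped out, for example in tests.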

The SCM feature is tantalizingly close to what we need. As I understand it, at export time it runs git rev-parse HEAD and stores the result in conandata.yml. I've taken a similar approach:

conanfile.py:

# Example of using MultiSourceConanFile

from conans import CMake
from multisource import MultiSourceConanFile, GitSource

class HelloConan(MultiSourceConanFile):
    name = "hello"
    version = "0.1"
    settings = "os", "compiler", "build_type", "arch"

    def build(self):
        cmake = CMake(self)
        cmake.configure(source_folder="hello")
        cmake.build()

    def package(self):
        self.copy("*.a", dst="lib", keep_path=False)

    def get_sources(self):
        return [
            GitSource("https://github.com/conan-io/hello.git", "hello"),
            # ...
        ]

multisource.py:

import os
import subprocess
from typing import List
import yaml
from conans import ConanFile, tools

class GitSource:
    def __init__(self, url, folder):
        self.url = url
        self.folder = folder

    def source(self, revision=None):
        # Run git clone and optionally check out the specified revision
        git = tools.Git(folder=self.folder)
        git.clone(self.url)
        if revision:
            git.checkout(revision)

    def get_revision(self):
        # Find the revision that's currently checked out.
        # Conan calls export() before calling source() so we must be
        # tolerant of the source not yet existing.
        if not os.path.exists(self.folder):
            return None
        return tools.Git(folder=self.folder).get_revision()

class MultiSourceConanFile(ConanFile):
    exports = ["multisource.py"]
    no_copy_source = True

    def get_sources(self) -> List[GitSource]:
        raise NotImplementedError("Override this to return a list of sources")

    def source(self):
        """
        This method does the actual checkout.
        Derived classes should not implement source() but should
        instead implement a get_sources() method that returns a list
        of source objects.
        """

        # Use the revisions captured at export time, if any.
        if self.conan_data:
            source_revisions = self.conan_data.get("source_revisions", {})
        else:
            source_revisions = {}

        for s in self.get_sources():
            s.source(source_revisions.get(s.folder))

    def export(self):
        # Update conandata.yml with the current source revisions.

        source_revisions = {}

        # Assuming that the recipe is stored in a Git repo at the root
        # of the source folder, find the source folder.
        # We assume that export won't be called once the recipe has
        # already been exported.
        source_folder = os.path.dirname(
            subprocess.check_output(
                ["git", "rev-parse", "--show-toplevel"]
            ).decode()
        )

        with tools.chdir(source_folder):
            for s in self.get_sources():
                revision = s.get_revision()
                if revision:
                    source_revisions[s.folder] = revision

        conan_data = self.conan_data or {}
        conan_data["source_revisions"] = source_revisions

        tools.save(
            os.path.join(self.export_folder, "conandata.yml"),
            yaml.dump(conan_data),
        )

This works, but there are a couple of smells:

You need to run conan source before conan create, and since conan source calls export(), the export() method itself can't assert that the sources are present. (For my team that isn't a problem because we're in the habit of using conan install/source/build/export-pkg anyway.)
conan create and conan export don't take a source folder argument so we have to infer the source folder from the recipe location.
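
The inference in the second point can be isolated into a small helper, which at least makes the assumption explicit: the recipe's repo is checked out directly inside the folder that holds the other sources. This is a minimal sketch; `parent_of_toplevel` and `infer_source_folder` are hypothetical names, not Conan API.

```python
import os
import subprocess


def parent_of_toplevel(raw_output):
    """Given the bytes printed by `git rev-parse --show-toplevel`,
    return the directory that contains that repository."""
    toplevel = raw_output.decode().strip()  # drop the trailing newline
    return os.path.dirname(toplevel)


def infer_source_folder(recipe_dir):
    """Ask git for the root of the recipe's repo, starting from
    `recipe_dir`, and return the folder that contains it."""
    raw = subprocess.check_output(
        ["git", "rev-parse", "--show-toplevel"], cwd=recipe_dir
    )
    return parent_of_toplevel(raw)
```

Passing the recipe's directory as `cwd` avoids depending on the directory the command happened to be launched from, which is one less thing to infer.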

Any guidance for the best way forward here?

[x] I've read the CONTRIBUTING guide.

memsharded commented 2 years ago

Hi @michaelplatingsarm

I am trying to understand this use case, but I am struggling a bit, so a couple of questions: The scm feature with "auto", which obtains the current repo commit, is completely self-defined. The current repo checkout defines the commit completely. But if you don't reference your current repo, then don't you already need the full correct "coordinates" of the different repos to checkout in order to get them? And if you have already defined them, why would it be necessary to capture them? What am I missing here? I think this is the key to understanding this, but please let me know.

michaelplatingsarm commented 2 years ago

Hi @memsharded thanks for taking a look.

don't you already need the full correct "coordinates" of the different repos to checkout in order to get them

At the point the recipe is exported, the "coordinates" are the HEADs of the different repos. Later on, when we want to recreate the package we don't want HEAD, we want whatever the revisions were when the recipe was exported.
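
A toy model of the two phases, with made-up hashes and hypothetical helper names (`capture`, `revision_for`), may make the distinction concrete. It only mimics the bookkeeping and does not touch git or Conan:

```python
def capture(heads):
    """Export time: snapshot the current HEAD of every repo."""
    return {"source_revisions": dict(heads)}


def revision_for(folder, conan_data):
    """Source time: the pinned revision, or None if nothing was captured."""
    return (conan_data or {}).get("source_revisions", {}).get(folder)


# Made-up hashes standing in for the HEADs at export time.
heads = {"hello": "a1b2c3", "world": "d4e5f6"}
pinned = capture(heads)

# Upstream moves on after the export.
heads["hello"] = "0f9e8d"

# Recreating the package still uses the revision seen at export time.
print(revision_for("hello", pinned))  # prints a1b2c3
```

When nothing was captured for a folder, `revision_for` returns `None`, which corresponds to cloning without checking out a specific revision.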

Does that make it clearer?

memsharded commented 2 years ago

Ok, I think I'm starting to understand it a bit better. Another quick question: I guess you have considered git submodules in the past, haven't you? Because this looks like a re-implementation of git submodules. If instead of doing this you added the different Git sources as submodules of the repo containing the recipe, wouldn't this achieve the flow that you intend?

michaelplatingsarm commented 2 years ago

Yes I've considered it, and you're right that in principle it would solve the problem. But I've had my fingers burned in the past by Mercurial subrepos and articles like these suggest that Git submodules are no better:

conan-io / conan

[question] Capturing revisions from multiple source repos #10256
