conan-io / conan

Conan - The open-source C and C++ package manager
https://conan.io
MIT License
8.17k stars 974 forks source link

[bug] Recipes with invalid remote URLs in conandata can be uploaded #16591

Closed jasal82 closed 2 months ago

jasal82 commented 3 months ago

Describe the bug

Conan 1.64.1

How to reproduce it

When using the export()/source() methods in combination with manually implemented url and commit capturing (as recommended in the docs as replacement for the deprecated scm attribute) there is one code path which can result in Conan uploading invalid recipes to a remote. Let's suppose we have the following export method:

def export(self):
    git = Git(self, self.recipe_folder)
    scm_url, scm_commit = git.get_url_and_commit()
    update_conandata(self, {"sources": {"commit": scm_commit, "url": scm_url}})

The get_url_and_commit() method will fail as expected when there are uncommitted changes on the local working copy. It will warn when the local commit is not in the remote (but still return a valid url), which is already problematic. But the actual critical code path is when either the local working copy has no remote set (which can happen in some Gitlab CI merge request pipelines where a separate working copy is created that contains only a single detached ref) or the remote is temporarily unreachable. In both of these cases Conan will silently set the url to the path of the working copy in the local filesystem. This path gets written into the conandata.yml and Conan will happily upload that to the remotes afterwards.

There are several reasons why I think this is a bug:

This was breaking our release builds due to temporary Git remote connection issues, leading to broken releases in the remotes. Here is the hook that I'm currently using to mitigate the problem on our side, but I'd rather not put that into production because it could break any day:

from urllib.parse import urlparse
from unittest.mock import patch
from conan.errors import ConanException
from conan.tools.scm import Git

original_get_url_and_commit = Git.get_url_and_commit

def custom_get_url_and_commit(self):
    url, commit = original_get_url_and_commit(self)
    try:
        url_parsed = urlparse(url)
        if url_parsed.scheme not in ["http", "https", "ssh"]:
            raise ValueError()
        return url, commit
    except ValueError:
        raise ConanException("Invalid URL returned by get_url_and_commit(). This might be caused by a missing remote on the working copy or temporary Git remote connection issues.")

def pre_export(output, conanfile, conanfile_path, reference, *args, **kwargs):
    global patcher
    patcher = patch.object(Git, 'get_url_and_commit', new=custom_get_url_and_commit)
    global mock_method
    mock_method = patcher.start()

I need some advice on how to proceed here without having to mock-patch or creating our own Conan fork.

memsharded commented 3 months ago

Hi @jasal82

Thanks for your report.

I am having a look to the code, and investigating how this could be improved.

In the meantime, I think the hook approach is not bad, but I would handle this differently, to make it more solid. Instead of patching the method in a pre-export hook, I would do a post-export hooks that reads the conandata.yml and checks the recorded url and raise accordingly.

jasal82 commented 3 months ago

Hi @memsharded

that was my first idea but there are two problems with the approach:

jasal82 commented 3 months ago

I think the central question is why a file system path instead of an url is considered a valid return value for get_url_and_commit(). What is the use-case here? Is it to support building a recipe on a local source tree which is not a working copy? IMHO getting a path instead of an url is surprising and unexpected, so I'd rather have the entire use-case handled differently. For example by not running export/source code at all if we're in a completely detached source tree environment. However, due to the fact that the conanfile-in-source-repo case is now handled the exact same way as the conanfile-needs-additional-sources-from-somewhere-else case this would be very difficult to implement.

jasal82 commented 3 months ago

Alternatively there could be a command line flag or conf which makes the build fail if that warning at the end of get_url_and_commit() is generated.

memsharded commented 3 months ago

The main reason for this behavior is to allow the local developer flow without needing to push commits to the server. In order to have both the scm feature running and being able to test and locally create packages without pushing commits to the server, which is a big blocker for practically 100% of users, it is necessary that the scm feature also works seamlessly locally, using the local commits and not the server ones.

I agree that we want to improve the robustness to avoid unexpectedly uploading to the server local paths. I am checking 2 possibilities atm:

Alternatively there could be a command line flag or conf which makes the build fail if that warning at the end of get_url_and_commit() is generated.

This already exists in Conan 2, that has warnings-as-errors configuration. We could even add a new warning tag to fail on this specific warning.

jasal82 commented 3 months ago

I understand the local workflow case, but currently two different root causes are merged into one error state:

  1. Local commit not in remote -> this is the local workflow case, you could still return a valid url because the working copy has a remote set
  2. Working copy has no remote or remote not reachable -> this is different and should not result in the same warning and same error state
memsharded commented 3 months ago

These 2 scenarios are mostly the same for the local developer flow. Many users will do a local git init ., start working on a recipe, etc, so the working copy has no remote yet at all. Still the conan create command should work for them seamlessly in scenario 2.

Scenario 1, the local folder should be used as remote, because this is what the internal source() method will use to get the code.

So the final condition and state in both scenarios seems that it should still be the same: a warning and the scm information pointing to the local clone of the user, not the remote.

I am exploring a hybrid solution in https://github.com/conan-io/conan/pull/16597, raising if trying to upload it, having a conf to either fail fast or allow it.

memsharded commented 2 months ago

Closed by https://github.com/conan-io/conan/pull/16597 for next Conan 2.6 release