kubeflow / pipelines

Machine Learning Pipelines for Kubeflow
https://www.kubeflow.org/docs/components/pipelines/
Apache License 2.0

[sdk] --trusted-host Automatically added to pip command in Kubeflow Pipelines #11155

Open diegolovison opened 2 months ago

diegolovison commented 2 months ago

Description: The Kubeflow Pipelines component is currently adding the --trusted-host option to the pip command by default. This occurs because the value is being copied directly from the pip_trusted_hosts configuration.

Security Concern: Using the --trusted-host option disables SSL certificate validation for the specified host, which can expose the system to significant security risks. Specifically, it makes the environment vulnerable to man-in-the-middle (MITM) attacks, where an attacker could intercept and potentially alter the packages being installed. This is particularly concerning in environments that require strict security controls, such as airgapped or production systems.

Expected Behavior: The --trusted-host option should not be automatically added to the pip command unless explicitly configured by the user. The default behavior should enforce SSL certificate validation to ensure secure package installations.
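
To make the expected behavior concrete, here is a minimal sketch of an opt-in command builder. It is not the SDK's actual implementation: `build_pip_install_command` and its parameters are hypothetical names chosen for illustration. The only point is that `--trusted-host` appears when the caller passes hosts explicitly and is never derived from the index URLs.

```python
from typing import List, Optional


def build_pip_install_command(
    packages: List[str],
    index_urls: Optional[List[str]] = None,
    trusted_hosts: Optional[List[str]] = None,
) -> List[str]:
    """Hypothetical helper: build a pip install argv list.

    --trusted-host is emitted only for hosts passed explicitly in
    trusted_hosts; nothing is inferred from index_urls.
    """
    cmd = ["python3", "-m", "pip", "install", "--quiet", "--no-warn-script-location"]
    if index_urls:
        # First URL is the primary index, the rest are extra indexes.
        cmd += ["--index-url", index_urls[0]]
        for extra in index_urls[1:]:
            cmd += ["--extra-index-url", extra]
    # Opt-in only: with no explicit hosts, certificate validation stays on.
    for host in trusted_hosts or []:
        cmd += ["--trusted-host", host]
    cmd += packages
    return cmd
```

With only `index_urls=['https://myurl.org/simple']`, the resulting command contains no `--trusted-host` argument; passing `trusted_hosts=['myurl.org']` adds it back for users who knowingly want it.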

Steps to reproduce

from kfp import dsl

# common_base_image is assumed to be defined elsewhere (a container image string).
@dsl.component(base_image=common_base_image,
               pip_index_urls=['https://myurl.org/simple'])
def flip_coin() -> str:
    """Flip a coin and output heads or tails randomly."""
    import random

    result = "heads" if random.randint(0, 1) == 0 else "tails"
    print(result)
    return result

Expected result

The output below is generated by the SDK and has been reformatted for readability.

if ! [ -x "$(command -v pip)" ]; then
    python3 -m ensurepip || python3 -m ensurepip --user || apt-get install python3-pip
fi

PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location \
    --index-url https://myurl.org/simple 'kfp==2.1.3' '--no-deps' \
    'typing-extensions>=3.7.4,<5; python_version<"3.9"' && \
    python3 -m pip install --quiet --no-warn-script-location \
    --index-url https://myurl.org/simple 'package1' 'package2' && \
    "$0" "$@"


HumairAK commented 2 months ago

The --trusted-host option should not be automatically added to the pip command unless explicitly configured by the user. The default behavior should enforce SSL certificate validation to ensure secure package installations.

100% for this. While this PR: https://github.com/kubeflow/pipelines/pull/11151 at least enables the user to opt in to the secure option, it is unreasonable to require a user to enable secure pip installs separately for each component. We should not be adding extra overhead simply to do something securely.

Given the security concerns here, my preference is that we add no --trusted-host flag by default, even if that is a breaking change. My concern is that most users do not realize their pipelines might be doing pip installs insecurely.

If maintainers are adamant that we not break backwards compatibility, my alternative suggestion is to do something similar to this caching PR, whereby we allow changing the default behavior via some global CLI flag or env var.
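
A rough sketch of what such a global switch could look like, purely for discussion. The variable name `KFP_PIP_AUTO_TRUSTED_HOST` and the function `auto_trusted_host_enabled` are hypothetical and do not exist in the SDK today. This version assumes the secure behavior is the default and the env var re-enables the legacy behavior; if backwards compatibility wins, the default could be flipped.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable name, not an existing KFP setting.
ENV_VAR = "KFP_PIP_AUTO_TRUSTED_HOST"


def auto_trusted_host_enabled(
    explicit: Optional[bool] = None,
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Decide whether --trusted-host may be added automatically.

    Precedence: explicit per-component value, then the environment
    variable, then the secure default (False).
    """
    if explicit is not None:
        return explicit
    env = os.environ if env is None else env
    return env.get(ENV_VAR, "").strip().lower() in {"1", "true", "yes"}
```

A per-component setting still wins over the global one, so a user who sets nothing gets certificate validation everywhere, and a user who needs the old behavior sets the variable once instead of touching every component.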

@chensun / @zijianjoy wdyt?

github-actions[bot] commented 2 days ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.