GoogleCloudPlatform / kubeflow-distribution

Blueprints for Deploying Kubeflow on Google Cloud Platform and Anthos
Apache License 2.0

Periodically run KF ready tests against auto deployments #52

Open jlewi opened 4 years ago

jlewi commented 4 years ago

Follow on to #42

We should set up a periodic test that runs the tests verifying that the Kubeflow applications were correctly deployed.

issue-label-bot[bot] commented 4 years ago

Issue-Label Bot is automatically applying the labels:

Label Probability
area/engprod 0.74
kind/feature 0.87

Please mark this comment with :thumbsup: or :thumbsdown: to give our bot feedback! Links: app homepage, dashboard and code for this bot.

jlewi commented 4 years ago

I'm getting a weird Kubernetes client error when running the tests.

self = <kubernetes.config.kube_config.KubeConfigLoader object at 0x7f12cc3197c0>

    def _refresh_gcp_token(self):
        if 'config' not in self._user['auth-provider']:
            self._user['auth-provider'].value['config'] = {}
        provider = self._user['auth-provider']['config']
        credentials = self._get_google_credentials()
        provider.value['access-token'] = credentials.token
        provider.value['expiry'] = format_rfc3339(credentials.expiry)
        if self._config_persister:
>           self._config_persister()
E           TypeError: _save_kube_config() missing 1 required positional argument: 'config_map'

/usr/local/lib/python3.8/dist-packages/kubernetes/config/kube_config.py:363: TypeError
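
For anyone reading along, the failure mode in the traceback can be illustrated in isolation: a callback that requires an argument is invoked with none. This is only a minimal sketch, not the kubernetes client's actual code; `save_config`, `Loader`, `reproduce`, and `workaround` are hypothetical names made up for the illustration, and binding the argument with `functools.partial` is one possible workaround, not a confirmed fix for this issue.

```python
import functools


def save_config(config_map):
    """Hypothetical persister that needs the config it should save.

    Stand-in for a function like _save_kube_config(config_map).
    """
    return dict(config_map)


class Loader:
    """Hypothetical stand-in for a loader that calls its persister with no arguments."""

    def __init__(self, config_persister=None):
        self._config_persister = config_persister

    def refresh(self):
        # Mirrors the shape of the failing call: the persister gets no arguments.
        if self._config_persister:
            return self._config_persister()
        return None


def reproduce():
    """Return the TypeError message raised when the persister is missing its argument."""
    try:
        Loader(save_config).refresh()
    except TypeError as err:
        return str(err)
    return None


def workaround(config_map):
    """Bind the argument up front so the zero-argument call succeeds."""
    bound = functools.partial(save_config, config_map)
    return Loader(bound).refresh()
```

Calling `reproduce()` yields a message of the same form as the one in the traceback, while `workaround({...})` completes because the persister already carries its `config_map`.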

jlewi commented 4 years ago

Here's the full stacktrace

kf_is_ready_test.py:70: in check_deployments_ready
    util.load_kube_config()
/srcCache/kubeflow/testing/py/kubeflow/testing/util.py:814: in load_kube_config
    loader.load_and_set(config) # pylint: disable=too-many-function-args
/usr/local/lib/python3.8/dist-packages/kubernetes/config/kube_config.py:550: in load_and_set
    self._load_authentication()
/usr/local/lib/python3.8/dist-packages/kubernetes/config/kube_config.py:278: in _load_authentication
    if self._load_auth_provider_token():
/usr/local/lib/python3.8/dist-packages/kubernetes/config/kube_config.py:293: in _load_auth_provider_token
    return self._load_gcp_token(provider)
/usr/local/lib/python3.8/dist-packages/kubernetes/config/kube_config.py:350: in _load_gcp_token
    self._refresh_gcp_token()

It looks like the version of the kubernetes library in the container is "11.0.0"; locally I have "9.0.0".
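
A quick way to confirm this kind of mismatch is to print the installed package version in both environments. The sketch below uses the standard library's `importlib.metadata` (Python 3.8+); the helper names `installed_version`, `major`, and `same_major` are made up for illustration, and comparing only the major version is an assumption about what matters here.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major(version_string):
    """Extract the major component from a dotted version such as '11.0.0'."""
    return int(version_string.split(".")[0])


def same_major(a, b):
    """Report whether two dotted version strings share a major version."""
    return major(a) == major(b)


if __name__ == "__main__":
    # Run this in the container and locally, then compare the output.
    print("kubernetes:", installed_version("kubernetes"))
```

With the two versions reported above, `same_major("11.0.0", "9.0.0")` is false, which would flag the container and the local environment as differing.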

issue-label-bot[bot] commented 4 years ago

Issue-Label Bot is automatically applying the labels:

Label Probability
platform/gcp 0.73

Please mark this comment with :thumbsup: or :thumbsdown: to give our bot feedback! Links: app homepage, dashboard and code for this bot.

jlewi commented 4 years ago

Dashboard: https://k8s-testgrid.appspot.com/sig-big-data#kubeflow-gcp-blueprints-master-periodic

Tests are running regularly, but some of them are failing.

jlewi commented 4 years ago

@Bobgy The test is passing; can we close this? https://k8s-testgrid.appspot.com/sig-big-data#kubeflow-gcp-blueprints-master-periodic&group-by-hierarchy-pattern=%5B%5Cw-%5D%2B

Bobgy commented 4 years ago

It looks like the kf_is_ready and metadata_is_ready tests are not passing yet. Did you see them?

Bobgy commented 4 years ago

Or, if this issue is just for setting up the test, then I have no objection.

jtfogarty commented 4 years ago

/area gcp-blueprints
/priority p2

jtfogarty commented 4 years ago

/priority p2