IshentRas / cookbook-openshift3

Chef Cookbook for Openshift
https://supermarket.chef.io/cookbooks/cookbook-openshift3
MIT License

Support for Origin v1.5.0 #114

Closed jperville closed 6 years ago

jperville commented 7 years ago

Now that Openshift Origin v1.5.0 has been released, I plan to work on adding support for the new version in the coming weeks.

Checklist (to be completed):

IshentRas commented 7 years ago

Alright :-) Sounds like a plan

IshentRas commented 7 years ago

Let's see if we can introduce an ad hoc upgrade recipe.

jperville commented 7 years ago

@IshentRas I have submitted a draft PR for this feature. Looking forward to testing the adhoc upgrade recipe.

IshentRas commented 7 years ago

@jperville OK, the next step is metrics/logging, plus the new customisation items we will offer to end users.

ebenezar-mccoy commented 7 years ago

I am using 1.5 in my learning environment. Could you please share which features are not safe to use (as I understand it, metrics and logging are among them)?

jperville commented 7 years ago

Hello @ebenezar-mccoy, I have a 1.5.x cluster in production. As you said, the instructions for logging and metrics no longer work (the official documentation now just says "use ansible playbook"). I'd like to see those ported to this cookbook one day, but in the meantime you can try the v1.4.1 images on top of an OpenShift 1.5.x cluster; it should work (see https://docs.openshift.com/container-platform/3.4/install_config/aggregate_logging.html for documentation).

Also, beware of using prebuilt Docker images from Docker versions >= 1.10: the internal manifest format has changed, making it impossible to reference the layers by sha256sum (that's not a problem with this cookbook, but a gotcha that every OpenShift user should know).

ianmiell commented 7 years ago

Is the last item still to-do?

jperville commented 7 years ago

Can close, v1.5 is working in production for me.

ebenezar-mccoy commented 7 years ago

@jperville, I'm a bit confused: are logging and metrics still on the to-do list for 1.5, or are we safe to use the images from 1.5?

jperville commented 7 years ago

Hello @ebenezar-mccoy; the logging addon has never been supported by this cookbook (have a look at https://github.com/IshentRas/cookbook-openshift3/tree/master/resources ), and the metrics addon is supported if you supply all the parameters, so this cookbook works as well with OpenShift v1.5.1 as it did with OpenShift v1.4.x.

I was just reporting my experience with my own LWRP for deploying the logging component. The instructions for v1.5.x are no longer available in the official documentation; it just says "use ansible". This means that if we were to support deploying the logging component with this cookbook (which is not possible today), we would have to backport the logic from the Ansible playbooks instead of applying the documented steps, which worked up to and including v1.4.x.

IshentRas commented 7 years ago

Sorry @jperville @ebenezar-mccoy for the delay replying to your post. Currently working on it. Should be with us shortly :+1:

IshentRas commented 7 years ago

@jperville the metrics is ready :+1: Will send a PR shortly

IshentRas commented 7 years ago

@ebenezar-mccoy https://github.com/IshentRas/cookbook-openshift3/blob/openshift_hosted_metrics/attributes/metrics.rb#L36

IshentRas commented 7 years ago

@jperville @ianmiell Will need to refactor the tests for metrics as the way of deploying has changed

ebenezar-mccoy commented 7 years ago

I am trying to test those changes to launch metrics in my test environment. When the hawkular-metrics pod tries to start, it gets the following error:

Events:

Error syncing pod, skipping: failed to "StartContainer" for "hawkular-metrics" with CrashLoopBackOff: "Back-off 20s restarting failed container=hawkular-metrics pod=hawkular-metrics-rk07n_openshift-infra(418770f1-615c-11e7-b5a5-005056b79c1f)"

Logs:

2017-07-05 08:34:07 Starting Hawkular Metrics
Error: the service account for Hawkular Metrics does not have permission to view resources in this namespace. View permissions are required for Hawkular Metrics to function properly.
Usually this can be resolved by running: oc adm policy add-role-to-user view system:serviceaccount:openshift-infra:hawkular -n openshift-infra

my configuration:

      "openshift_hosted_cluster_metrics": true,
      "openshift_metrics_image_version": "v1.5.1",
      "openshift_common_default_nodeSelector": "region=infra"

IshentRas commented 7 years ago

I am unable to reproduce what you are seeing. Can you share the output of the following commands, please?

1) oc get clusterrolebinding hawkular-view
2) oc get clusterrolebinding | grep /view

William

IshentRas commented 7 years ago

Output from my test:

[root@srv-101 ~]# oc logs -f hawkular-metrics-hpqf4
2017-07-06 17:04:05 Starting Hawkular Metrics
The service account has read permissions for its project. Proceeding
/opt/hawkular/auth /opt/jboss
Certificate was added to keystore
[Storing hawkular-metrics.truststore]
/opt/jboss

JBoss Bootstrap Environment .....

ebenezar-mccoy commented 7 years ago

Sure, here it is:

$ oc get clusterrolebinding hawkular-view
NAME            ROLE      USERS     GROUPS    SERVICE ACCOUNTS           SUBJECTS
hawkular-view   /view                         openshift-infra/hawkular
$ oc get clusterrolebinding | grep /view
hawkular-view                                   /view                                                                                                           openshift-infra/hawkular
$ oc logs -f hawkular-metrics-stg62
2017-07-06 19:44:47 Starting Hawkular Metrics
Error: the service account for Hawkular Metrics does not have permission to view resources in this namespace. View permissions are required for Hawkular Metrics to function properly.
Usually this can be resolved by running: oc adm policy add-role-to-user view system:serviceaccount:openshift-infra:hawkular -n openshift-infra

IshentRas commented 7 years ago

@ebenezar-mccoy OK, I have found a few bugs, which I am about to fix. The rolebinding for hawkular-metrics was not applied correctly. I will keep you posted as soon as I push a new version (likely later today).

IshentRas commented 7 years ago

142

IshentRas commented 7 years ago

@ebenezar-mccoy Can you please test again against the latest code, 1.10.58?

ebenezar-mccoy commented 7 years ago

Hi @IshentRas, it's still not working for me on 1.10.58, although the commands now give different output:

[root@oc01 ~]# oc logs -f hawkular-metrics-mv0xt
2017-07-14 07:36:22 Starting Hawkular Metrics
Error: the service account for Hawkular Metrics does not have permission to view resources in this namespace. View permissions are required for Hawkular Metrics to function properly.
Usually this can be resolved by running: oc adm policy add-role-to-user view system:serviceaccount:openshift-infra:hawkular -n openshift-infra
[root@oc01 ~]# oc get clusterrolebinding hawkular-view
Error from server (NotFound): clusterrolebinding "hawkular-view" not found
[root@oc01 ~]# oc get clusterrolebinding | grep /view
[root@oc01 ~]#

My test configuration: https://github.com/ebenezar-mccoy/chef_openshift_test

IshentRas commented 7 years ago

We have fixed the code to deploy the rolebinding locally instead of cluster-wide. You should therefore get the following (within openshift-infra):

[root@srv-101 ~]# oc get rolebinding -n openshift-infra
NAME                   ROLE                    USERS     GROUPS                                   SERVICE ACCOUNTS     SUBJECTS
hawkular-view          /view                                                                      hawkular
system:deployer        /system:deployer                                                           deployer
system:image-builder   /system:image-builder                                                      builder
system:image-puller    /system:image-puller              system:serviceaccounts:openshift-infra

I have redeployed everything (crash and burn) and it all looks fine for me... I am a bit lost as to why it is not working for you...

[root@srv-101 ~]# oc get pod
NAME                         READY     STATUS    RESTARTS   AGE
hawkular-cassandra-1-zgb1q   1/1       Running   0          5m
hawkular-metrics-6n418       1/1       Running   0          5m
heapster-xqf6x               1/1       Running   0          5m

[root@srv-101 ~]# oc logs hawkular-metrics-6n418 | head -n5
2017-07-14 10:05:26 Starting Hawkular Metrics
The service account has read permissions for its project. Proceeding
/opt/hawkular/auth /opt/jboss
Certificate was added to keystore
[Storing hawkular-metrics.truststore]
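The rolebinding listing in this comment can also be checked mechanically. The following is only a rough sketch, assuming the column layout of `oc get rolebinding` shown in this thread; `binding_present?` is a hypothetical helper and not part of the cookbook. Note that it only confirms the binding is listed, which, as the rest of this thread shows, does not by itself guarantee that the pod starts.

```ruby
# Sample `oc get rolebinding -n openshift-infra` output, copied from this thread.
ROLEBINDINGS = <<~OUT
  NAME                   ROLE                    USERS     GROUPS                                   SERVICE ACCOUNTS     SUBJECTS
  hawkular-view          /view                                                                      hawkular
  system:deployer        /system:deployer                                                           deployer
  system:image-builder   /system:image-builder                                                      builder
  system:image-puller    /system:image-puller              system:serviceaccounts:openshift-infra
OUT

# Hypothetical helper: true when some row binds `role` and mentions `account`.
# Columns are whitespace-separated and may be empty, so only NAME and ROLE are
# read positionally; the remaining fields are searched as a whole.
def binding_present?(listing, role, account)
  listing.lines.drop(1).any? do |line|
    fields = line.split
    fields[1] == role && fields.drop(2).include?(account)
  end
end

puts binding_present?(ROLEBINDINGS, '/view', 'hawkular')
```

In practice the listing would come from running the `oc` command on a master rather than from a pasted string.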

ebenezar-mccoy commented 7 years ago

[root@oc01 ~]# oc get rolebinding -n openshift-infra
NAME                   ROLE                    USERS     GROUPS                                   SERVICE ACCOUNTS     SUBJECTS
admin                  /admin                  admin
hawkular-view          /view                                                                      hawkular
system:deployer        /system:deployer                                                           deployer, deployer
system:image-builder   /system:image-builder                                                      builder, builder
system:image-puller    /system:image-puller              system:serviceaccounts:openshift-infra
[root@oc01 ~]#  oc get pod
NAME                         READY     STATUS             RESTARTS   AGE
hawkular-cassandra-1-bjjlq   1/1       Running            0          3h
hawkular-metrics-mv0xt       0/1       CrashLoopBackOff   41         3h
heapster-hjbsl               0/1       Running            20         3h

heapster is waiting for metrics

[root@oc01 ~]# oc logs hawkular-metrics-mv0xt
2017-07-14 10:40:57 Starting Hawkular Metrics
Error: the service account for Hawkular Metrics does not have permission to view resources in this namespace. View permissions are required for Hawkular Metrics to function properly.
Usually this can be resolved by running: oc adm policy add-role-to-user view system:serviceaccount:openshift-infra:hawkular -n openshift-infra

Redeploying this environment from scratch is part of my routine.
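For repeated redeployments, the pod listing in this comment can be scanned for pods that are not fully ready. This is a minimal sketch, assuming the `oc get pod` column layout shown here (NAME, READY as `current/wanted`, STATUS, ...); `unready_pods` is a hypothetical helper name.

```ruby
# Sample `oc get pod` output, copied from this comment.
PODS = <<~OUT
  NAME                         READY     STATUS             RESTARTS   AGE
  hawkular-cassandra-1-bjjlq   1/1       Running            0          3h
  hawkular-metrics-mv0xt       0/1       CrashLoopBackOff   41         3h
  heapster-hjbsl               0/1       Running            20         3h
OUT

# Hypothetical helper: names of pods whose READY column shows fewer ready
# containers than wanted (for example 0/1).
def unready_pods(listing)
  rows = listing.lines.drop(1).map(&:split).reject(&:empty?)
  rows.select do |_name, ready, _status|
    current, wanted = ready.split('/').map(&:to_i)
    current < wanted
  end.map(&:first)
end

puts unready_pods(PODS)
```

Here both hawkular-metrics and heapster are reported, which matches the observation that heapster is waiting for metrics.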

ebenezar-mccoy commented 7 years ago

@IshentRas, do you have an idea how to solve this problem?

IshentRas commented 7 years ago

@ebenezar-mccoy Contact me on william17.burton@gmail.com

IshentRas commented 7 years ago

@ebenezar-mccoy Sorry for such a delay, and thank you for your patience. Your issue comes down to misuse of one of the variables :+1: You set "openshift_metrics_master_url": "metrics.domain.local", which is wrong, because that attribute overrides the URL of the Kubernetes listener (the masters' URL). I believe what you are trying to achieve is "openshift_metrics_hawkular_hostname": "metrics.domain.local", which sets the FQDN to use for Hawkular Metrics. I can confirm that switching the options solves your error. Can you please check? :-)
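The mix-up between the two attributes could be caught with a small check before converging. This is only a sketch: the attribute names are the ones quoted in this thread, but `metrics_attribute_warnings` is a hypothetical helper that does not exist in the cookbook, and the "a master URL should carry a scheme such as https://" heuristic is an assumption, as is the example URL in the usage note.

```ruby
require 'uri'

# Attribute names are taken from this thread; the values are illustrative.
WRONG = { 'openshift_metrics_master_url' => 'metrics.domain.local' }.freeze
RIGHT = { 'openshift_metrics_hawkular_hostname' => 'metrics.domain.local' }.freeze

# Hypothetical lint (an assumption, not cookbook behaviour): warn when
# openshift_metrics_master_url holds a bare hostname instead of a URL, since
# that usually means openshift_metrics_hawkular_hostname was intended.
def metrics_attribute_warnings(attrs)
  warnings = []
  master_url = attrs['openshift_metrics_master_url']
  if master_url && URI.parse(master_url).scheme.nil?
    warnings << "openshift_metrics_master_url=#{master_url} has no scheme; " \
                'did you mean openshift_metrics_hawkular_hostname?'
  end
  warnings
end

puts metrics_attribute_warnings(WRONG)
puts metrics_attribute_warnings(RIGHT).empty?
```

A value such as `https://kubernetes.default.svc` would pass the check, while a bare FQDN like the one above is flagged.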

ianmiell commented 6 years ago

I assume we can close this now?