Closed jperville closed 6 years ago
Alright :-) Sounds like a plan
Let's see if we can introduce an upgrade Adhoc recipe
@IshentRas I have submitted a draft PR for this feature. Looking forward to testing the adhoc upgrade recipe.
@jperville Ok next step is metrics/logging + new customisation items we shall offer to end user.
I am using 1.5 in my learning environment. Could you please share which features are not safe to use (as I understand it, metrics/logging are among them)?
Hello @ebenezar-mccoy, I have a 1.5.x cluster in production. As you said, the instructions for logging and metrics no longer work (because the official documentation now says "use ansible playbook"). I'd like to see those ported to this cookbook one day, but in the meantime you can try the v1.4.1 images on top of an openshift 1.5.x cluster; it should work (see https://docs.openshift.com/container-platform/3.4/install_config/aggregate_logging.html for documentation).
Also, beware of using prebuilt docker images from docker versions >= 1.10: the internal manifest format has changed, making it impossible to reference the layers by sha256sum (that's not a problem with this cookbook, but a gotcha that every openshift user should know).
Is the last item still to-do?
Can close, v1.5 is working in production for me.
@jperville, I'm a bit confused: are logging and metrics still on the to-do list for 1.5, or are we safe to use the images from 1.5?
Hello @ebenezar-mccoy; the logging addon has never been supported by this cookbook (have a look at https://github.com/IshentRas/cookbook-openshift3/tree/master/resources ), and the metrics addon is supported if you supply all the parameters, so this cookbook works as well with openshift v1.5.1 as it did with openshift v1.4.x.
I was just reporting my experience with my own LWRP for deploying the logging component. The instructions for v1.5.x are no longer available in the official documentation; it just says "use ansible". So if we were to support deploying the logging component with this cookbook (which is not possible today), we would have to backport the logic from the ansible playbooks instead of applying the documented steps, which worked up to and including v1.4.x.
Sorry @jperville @ebenezar-mccoy for the delay replying to your post. Currently working on it. Should be with us shortly :+1:
@jperville the metrics is ready :+1: Will send a PR shortly
@jperville @ianmiell Will need to refactor the tests for metrics as the way of deploying has changed
I am trying to test those changes to launch metrics in my test environment. When the hawkular-metrics pod tries to start, it gets the following error:
Events:
Error syncing pod, skipping: failed to "StartContainer" for "hawkular-metrics" with CrashLoopBackOff: "Back-off 20s restarting failed container=hawkular-metrics pod=hawkular-metrics-rk07n_openshift-infra(418770f1-615c-11e7-b5a5-005056b79c1f)"
Logs:
2017-07-05 08:34:07 Starting Hawkular Metrics
Error: the service account for Hawkular Metrics does not have permission to view resources in this namespace. View permissions are required for Hawkular Metrics to function properly.
Usually this can be resolved by running: oc adm policy add-role-to-user view system:serviceaccount:openshift-infra:hawkular -n openshift-infra
my configuration:
"openshift_hosted_cluster_metrics": true,
"openshift_metrics_image_version": "v1.5.1",
"openshift_common_default_nodeSelector": "region=infra"
I am unable to reproduce what you can see... Can you share the output of the following commands, please? 1) oc get clusterrolebinding hawkular-view 2) oc get clusterrolebinding | grep /view
William
JBoss Bootstrap Environment .....
Sure, here it is:
$ oc get clusterrolebinding hawkular-view
NAME ROLE USERS GROUPS SERVICE ACCOUNTS SUBJECTS
hawkular-view /view openshift-infra/hawkular
$ oc get clusterrolebinding | grep /view
hawkular-view /view openshift-infra/hawkular
$ oc logs -f hawkular-metrics-stg62
2017-07-06 19:44:47 Starting Hawkular Metrics
Error: the service account for Hawkular Metrics does not have permission to view resources in this namespace. View permissions are required for Hawkular Metrics to function properly.
Usually this can be resolved by running: oc adm policy add-role-to-user view system:serviceaccount:openshift-infra:hawkular -n openshift-infra
@ebenezar-mccoy OK, I have found a few bugs that I am about to fix: the rolebinding for hawkular-metrics was not correctly applied. I will keep you posted as soon as I push a new version (likely later today).
@ebenezar-mccoy Can you please test again with the latest code (1.10.58)?
Hi @IshentRas, it's still not working for me on 1.10.58, though the commands now give different output:
[root@oc01 ~]# oc logs -f hawkular-metrics-mv0xt
2017-07-14 07:36:22 Starting Hawkular Metrics
Error: the service account for Hawkular Metrics does not have permission to view resources in this namespace. View permissions are required for Hawkular Metrics to function properly.
Usually this can be resolved by running: oc adm policy add-role-to-user view system:serviceaccount:openshift-infra:hawkular -n openshift-infra
[root@oc01 ~]# oc get clusterrolebinding hawkular-view
Error from server (NotFound): clusterrolebinding "hawkular-view" not found
[root@oc01 ~]# oc get clusterrolebinding | grep /view
[root@oc01 ~]#
My test configuration: https://github.com/ebenezar-mccoy/chef_openshift_test
We have fixed the code to deploy the rolebinding locally instead of cluster-wide.
Therefore you should get the following (within the openshift-infra project):
[root@srv-101 ~]# oc get rolebinding -n openshift-infra
NAME ROLE USERS GROUPS SERVICE ACCOUNTS SUBJECTS
hawkular-view /view hawkular
system:deployer /system:deployer deployer
system:image-builder /system:image-builder builder
system:image-puller /system:image-puller system:serviceaccounts:openshift-infra
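To check for this binding from a script rather than by eye, here is a minimal sketch. The helper name `has_view_binding` is hypothetical (it is not part of the cookbook or of `oc`), and it assumes the column layout shown above, with the binding name in the first column and the role in the second.

```shell
# has_view_binding: reads `oc get rolebinding` output on stdin.
# Succeeds (exit 0) only if a row named hawkular-view is bound to the /view role;
# otherwise exits 1. Hypothetical helper, assumes NAME is column 1 and ROLE is column 2.
has_view_binding() {
  awk '$1 == "hawkular-view" && $2 == "/view" { found = 1 } END { exit !found }'
}
```

It could be used as `oc get rolebinding -n openshift-infra | has_view_binding || echo "hawkular-view rolebinding missing"`.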
I have redeployed everything (crash and burn) and it all looks fine for me... I am a bit lost as to why it is not working for you...
[root@srv-101 ~]# oc get pod
NAME READY STATUS RESTARTS AGE
hawkular-cassandra-1-zgb1q 1/1 Running 0 5m
hawkular-metrics-6n418 1/1 Running 0 5m
heapster-xqf6x 1/1 Running 0 5m
[root@srv-101 ~]# oc logs hawkular-metrics-6n418 | head -n5
2017-07-14 10:05:26 Starting Hawkular Metrics
The service account has read permissions for its project. Proceeding
/opt/hawkular/auth /opt/jboss
Certificate was added to keystore
[Storing hawkular-metrics.truststore]
[root@oc01 ~]# oc get rolebinding -n openshift-infra
NAME ROLE USERS GROUPS SERVICE ACCOUNTS SUBJECTS
admin /admin admin
hawkular-view /view hawkular
system:deployer /system:deployer deployer, deployer
system:image-builder /system:image-builder builder, builder
system:image-puller /system:image-puller system:serviceaccounts:openshift-infra
[root@oc01 ~]# oc get pod
NAME READY STATUS RESTARTS AGE
hawkular-cassandra-1-bjjlq 1/1 Running 0 3h
hawkular-metrics-mv0xt 0/1 CrashLoopBackOff 41 3h
heapster-hjbsl 0/1 Running 20 3h
heapster is waiting for metrics
[root@oc01 ~]# oc logs hawkular-metrics-mv0xt
2017-07-14 10:40:57 Starting Hawkular Metrics
Error: the service account for Hawkular Metrics does not have permission to view resources in this namespace. View permissions are required for Hawkular Metrics to function properly.
Usually this can be resolved by running: oc adm policy add-role-to-user view system:serviceaccount:openshift-infra:hawkular -n openshift-infra
Redeploying this environment from scratch is part of my routine.
@IshentRas, do you have an idea how to solve this problem?
@ebenezar-mccoy Contact me on william17.burton@gmail.com
@ebenezar-mccoy Sorry for such a delay, and thank you for your patience. Your issue comes down to a misuse of one of the variables :+1: You set "openshift_metrics_master_url": "metrics.domain.local", which is wrong because that variable overrides the URL of the kubernetes listener (the masters' URL). I believe what you are trying to achieve is "openshift_metrics_hawkular_hostname": "metrics.domain.local", which sets the FQDN to use for Hawkular Metrics. I can confirm that switching the options solves your error. Can you please check :-)
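For anyone hitting the same error, the switch described above can be sketched as a small filter over the attributes. This is only an illustration: the helper name `swap_metrics_attr` is hypothetical, and it assumes the attributes are kept as JSON text with the key quoted exactly as in this thread. It is only appropriate when the value is the Hawkular hostname rather than a real master URL.

```shell
# swap_metrics_attr: reads attribute JSON on stdin and renames the
# openshift_metrics_master_url key to openshift_metrics_hawkular_hostname.
# Hypothetical helper; it renames the key only and leaves the value untouched.
swap_metrics_attr() {
  sed 's/"openshift_metrics_master_url"/"openshift_metrics_hawkular_hostname"/'
}
```

For example, piping the line `"openshift_metrics_master_url": "metrics.domain.local"` through `swap_metrics_attr` yields the same line with the key changed to `"openshift_metrics_hawkular_hostname"`.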
I assume we can close this now?
Now that Openshift Origin v1.5.0 has been released, I plan to work on adding support for the new version in the coming weeks.
Checklist (to be completed):