openshift / openshift-ansible

Install and configure an OpenShift 3.x cluster
https://try.openshift.com
Apache License 2.0

Proposal for the /utils installer wrapper config format #784

Closed: detiber closed this issue 8 years ago

detiber commented 9 years ago

I would suggest that the config format should really be something along the lines of the following (using some variables that the quick installer doesn't yet allow setting, to show why I think it'll be valuable):

version: v1 # might as well conform with the OpenShift/Kubernetes api version naming scheme
deployment:
  variant: openshift-enterprise
  variant_version: 3.1
  ansible_ssh_user: root  # in reality we could even allow this to be set on a per-host basis as well
  ansible_become: no      # same as above
  use_openshift_sdn: yes
  sdn_network_plugin_name: redhat/openshift-ovs-subnet
  masters:
    default_subdomain: apps.test.example.com
    master_identity_providers:
    - name: htpasswd_auth
      login: true
      challenge: true
      kind: HTPasswdPasswordIdentityProvider
      filename: /etc/openshift/htpasswd
    hosts:
    - ip: 10.0.0.1
      hostname: master-private.example.com
      public_ip: 24.222.0.1
      public_hostname: master.example.com
      containerized: true
  nodes:
    kubelet_args:
      max_pods:
      - "100"
    storage_plugin_deps:
    - ceph
    - glusterfs
    hosts:
    - ip: 10.0.0.1
      hostname: master-private.example.com
      public_ip: 24.222.0.1
      public_hostname: master.example.com
      schedulable: no
      containerized: true
    - ip: 10.0.0.2
      hostname: node1-private.example.com
      public_ip: 24.222.0.2
      public_hostname: node1.example.com
      node_labels:
        region: primary
        zone: default
    - ip: 10.0.0.3
      hostname: node2-private.example.com
      public_ip: 24.222.0.3
      public_hostname: node2.example.com
      node_labels:
        region: secondary
        zone: default
  etcd:
    initial_cluster_token: etcd-cluster-1
    hosts:
    - ip: 10.0.0.4
      hostname: etcd1-private.example.com
      etcd_interface: eth0

My goal here is to bring the config as close as possible to something that we can easily serialize out to an inventory (plus a set of group/host vars files that let us avoid having to JSON-encode variables into an INI file). We should be able to build a very simple inventory file and then serialize the different parameters according to the group or host they belong to.

This way we should also be able to more easily support, via the unattended installation, features that aren't yet exposed in the CLI.

The resulting inventory would be:

[OSEv3:children]
masters
nodes
etcd

[masters]
10.0.0.1

[nodes]
10.0.0.1
10.0.0.2
10.0.0.3

[etcd]
10.0.0.4

We would write the inventory to a directory that also contained a group_vars directory and a host_vars directory.

group_vars/OSEv3.yml (the filenames need to match up with the group names in the inventory):

ansible_ssh_user: root  # in reality we could even allow this to be set on a per-host basis as well
ansible_become: no      # same as above
openshift_use_openshift_sdn: yes
openshift_sdn_network_plugin_name: redhat/openshift-ovs-subnet
deployment_type: openshift-enterprise  # computed from variant and variant_version

group_vars/masters.yml:

osm_default_subdomain: apps.test.example.com
openshift_master_identity_providers:
- name: htpasswd_auth
  login: true
  challenge: true
  kind: HTPasswdPasswordIdentityProvider
  filename: /etc/openshift/htpasswd

host_vars/10.0.0.1.yml (the filenames need to match up with the host name used in the inventory):

ip: 10.0.0.1
hostname: master-private.example.com
public_ip: 24.222.0.1
public_hostname: master.example.com
containerized: true
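
For illustration only, here is a minimal sketch of how a parsed config along these lines could be serialized out to that inventory layout. The function name and key handling are assumptions rather than actual installer code; in particular, translating keys such as use_openshift_sdn to their openshift_* inventory names is omitted.

import os
import yaml  # PyYAML

def write_inventory(config, dest_dir):
    # Sketch: write an INI inventory plus group_vars/ and host_vars/
    # directories from a config dict shaped like the proposal above.
    deployment = config['deployment']
    groups = {'masters': deployment['masters']['hosts'],
              'nodes': deployment['nodes']['hosts'],
              'etcd': deployment['etcd']['hosts']}

    os.makedirs(os.path.join(dest_dir, 'group_vars'), exist_ok=True)
    os.makedirs(os.path.join(dest_dir, 'host_vars'), exist_ok=True)

    # The inventory itself stays tiny: group membership only.
    with open(os.path.join(dest_dir, 'hosts'), 'w') as inv:
        inv.write('[OSEv3:children]\n' + '\n'.join(groups) + '\n')
        for group, hosts in groups.items():
            inv.write('\n[%s]\n' % group)
            inv.write('\n'.join(h['ip'] for h in hosts) + '\n')

    # Deployment-wide settings become group_vars/OSEv3.yml. Group-level
    # settings (e.g. masters.default_subdomain) would similarly land in
    # group_vars/<group>.yml; that and the openshift_* key translation are
    # left out of this sketch.
    osev3 = {k: v for k, v in deployment.items()
             if k not in ('masters', 'nodes', 'etcd')}
    with open(os.path.join(dest_dir, 'group_vars', 'OSEv3.yml'), 'w') as f:
        yaml.safe_dump(osev3, f, default_flow_style=False)

    # Per-host settings become host_vars/<address>.yml.
    for hosts in groups.values():
        for host in hosts:
            path = os.path.join(dest_dir, 'host_vars', '%s.yml' % host['ip'])
            with open(path, 'w') as f:
                yaml.safe_dump(host, f, default_flow_style=False)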

detiber commented 9 years ago

Moving the discussion from https://github.com/openshift/openshift-ansible/pull/751 here

@dgoodwin:

@detiber I just shipped a new config file format over to docs and was aiming for this to be our "supported" format going forward. Ideally we would not have to change this so we might need to talk on whether we should jump on setting up to match something close to what you have above before 3.1 if possible, and then get docs to update. I have some thoughts on adjustments to the above but will save those for later and see what we think we should do now in scrum.

@detiber:

@dgoodwin I'd definitely like to follow up with you on your thoughts on the above, however I'd say that we should probably look at these changes for 3.1+, since it would require quite a bit of refactoring of what is there for very little gain at this point, given that the installer does not support any of the advanced features that would necessitate it.

That said, I think we should make sure that we add a version field to the config prior to 3.1, so that we can make future changes and be able to handle them more gracefully than we have in the past (we can disable features that weren't present in an older config, or we can choose some "sane" defaults in other cases). This should also give us some wiggle room to make substantive changes like I proposed above without worrying about breaking existing users.
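
To make the version-handling point concrete, here is a minimal sketch of what gracefully loading an older config could look like; load_config, upgrade_legacy_config, and the supported-version list are illustrative, not actual installer code.

import yaml

SUPPORTED_VERSIONS = ('v1',)  # illustrative

def upgrade_legacy_config(config):
    # Hypothetical: a config with no version field predates the versioned
    # format, so map it onto the v1 layout and pick sane defaults for
    # anything the old format could not express.
    config.setdefault('version', 'v1')
    config.setdefault('deployment', {})
    return config

def load_config(path):
    with open(path) as config_file:
        config = yaml.safe_load(config_file)

    version = config.get('version')
    if version is None:
        config = upgrade_legacy_config(config)
    elif version not in SUPPORTED_VERSIONS:
        raise ValueError('Unsupported config version: %s' % version)
    return config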

dgoodwin commented 9 years ago

So from your example above, the only thing that's technically incompatible with the new format for 3.1 is (1) the nesting under a deployment section, and (2) how you specify hosts.

For (1) it doesn't seem to serve a purpose, but maybe this is a standard thing for kube configs. If it's important though, we could consider this part for 3.1.

For (2) I'd actually pitch we keep the current format:

hosts:
- hostname: 10.3.9.222
  ip: 10.3.9.222
  master: true
  node: true
  public_hostname: 10.3.9.222
  public_ip: 10.3.9.222
- hostname: 10.3.9.244
  ip: 10.3.9.244
  node: true
  public_hostname: 10.3.9.244
  public_ip: 10.3.9.244

Namely because it removes the tedious duplication of data for a master that is also a node. (It's conceivable folks will write these by hand and run the installer with -u.)

If we could drop (1) (or do it quickly before 3.1) and agree on (2), then technically the new config format would be compatible with what you have above, I think. We could expand on it without invalidating it.

detiber commented 9 years ago

So from your example above, the only thing that's technically incompatible with the new format for 3.1 is (1) the nesting under a deployment section, and (2) how you specify hosts.

For (1) it doesn't seem to serve a purpose, but maybe this is a standard thing for kube configs. If it's important though, we could consider this part for 3.1.

I see the nesting as a better way to handle serialization between JSON and the objects themselves, instead of having to have a translation layer.

Using my example above, I would see a config class containing version and deployment attributes, so serialization would be a matter of invoking a to_json method that serializes the version and then calls the to_json method on the deployment object. The deployment object would then serialize the attributes related to the deployment as a whole and invoke the to_json methods of its child objects, etc.

The main goal is that each type of object within the deployment can handle its own serialization, validation, etc. While there will be some inevitable bleed-over, both in duplication and in validation, I think in the end it is worth it to manage some of the complexity that will grow.
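
A minimal sketch of that to_json cascade (the class and method names are illustrative, and validation is omitted):

import json

class Host(object):
    def __init__(self, ip=None, hostname=None, **extra):
        self.ip = ip
        self.hostname = hostname
        self.extra = extra

    def to_dict(self):
        data = dict(self.extra, ip=self.ip, hostname=self.hostname)
        return {k: v for k, v in data.items() if v is not None}

class Deployment(object):
    def __init__(self, variant, hosts):
        self.variant = variant
        self.hosts = hosts  # list of Host objects

    def to_dict(self):
        # Serialize deployment-wide attributes, then delegate to children.
        return {'variant': self.variant,
                'hosts': [host.to_dict() for host in self.hosts]}

class Config(object):
    def __init__(self, version, deployment):
        self.version = version
        self.deployment = deployment

    def to_json(self):
        # The top-level object serializes its own fields and asks the
        # deployment (which in turn asks its children) for the rest.
        return json.dumps({'version': self.version,
                           'deployment': self.deployment.to_dict()})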

We could even do some trickery to handle the duplication issue by searching the deployment to fill in the blanks where possible: after reading the config and generating the objects, if any of the objects fail validation we could inspect the other objects for a matching hostname to provide the missing attributes. That way, duplicating the details would not be necessary as long as they are set on at least one of the definitions of the host.
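
A rough sketch of that fill-in-the-blanks step (illustrative only), operating on plain host dicts keyed by hostname or ip:

def fill_missing_host_attrs(host_entries):
    # Collect every attribute seen for a given host across all entries...
    merged_by_host = {}
    for entry in host_entries:
        key = entry.get('hostname') or entry.get('ip')
        merged = merged_by_host.setdefault(key, {})
        for attr, value in entry.items():
            merged.setdefault(attr, value)
    # ...then push the merged attributes back into any entry missing them,
    # so a host's details only need to be spelled out once.
    for entry in host_entries:
        key = entry.get('hostname') or entry.get('ip')
        for attr, value in merged_by_host[key].items():
            entry.setdefault(attr, value)
    return host_entries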

For (2) I'd actually pitch we keep the current format: hosts:

- hostname: 10.3.9.222
  ip: 10.3.9.222
  master: true
  node: true
  public_hostname: 10.3.9.222
  public_ip: 10.3.9.222
- hostname: 10.3.9.244
  ip: 10.3.9.244
  node: true
  public_hostname: 10.3.9.244
  public_ip: 10.3.9.244

I could see us possibly leveraging mixins instead of a strict class hierarchy, and we could blend these two approaches a bit:

version: v1
deployment:
- hostname: 10.3.9.222 # attribute of Host class
  ip: 10.3.9.222 # attribute of Host class
  master: true # setting this to true would import the OpenShift host and OpenShift master mixins
  node: true # setting this to true would import the OpenShift host and OpenShift node mixins
  public_hostname: 10.3.9.222 # attribute of OpenShift host mixin
  public_ip: 10.3.9.222 # attribute of OpenShift host mixin

That said, I don't have experience using mixins for this type of use case (applying them dynamically), or with how we could cleanly handle the serialization using mixins.
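
For what it's worth, the dynamic part could be done with type(), picking mixin bases from the flags. This is only a sketch under the assumption that the mixins just add attributes and methods; the class names are made up:

class Host(object):
    def __init__(self, **attrs):
        self.__dict__.update(attrs)

class OpenShiftHostMixin(object):
    def public_facts(self):
        return {'public_ip': getattr(self, 'public_ip', None),
                'public_hostname': getattr(self, 'public_hostname', None)}

class MasterMixin(object):
    is_master = True

class NodeMixin(object):
    is_node = True

def build_host(entry):
    # Pick mixins based on the master/node flags and build a class on the fly.
    bases = []
    if entry.get('master'):
        bases += [OpenShiftHostMixin, MasterMixin]
    if entry.get('node'):
        bases += [OpenShiftHostMixin, NodeMixin]
    bases.append(Host)
    # Deduplicate while preserving order so the MRO stays valid.
    ordered = []
    for base in bases:
        if base not in ordered:
            ordered.append(base)
    dynamic_cls = type('DynamicHost', tuple(ordered), {})
    return dynamic_cls(**entry)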

smunilla commented 8 years ago

Idea based off our meeting:

version: v1 # might as well conform with the OpenShift/Kubernetes api version naming scheme
deployment:
  variant: openshift-enterprise
  variant_version: 3.1
  ansible_ssh_user: root  # in reality we could even allow this to be set on a per-host basis as well
  ansible_become: no      # same as above
  use_openshift_sdn: yes
  sdn_network_plugin_name: redhat/openshift-ovs-subnet
  hosts:
    - ip: 10.0.0.1
      hostname: master-private.example.com
      public_ip: 24.222.0.1
      public_hostname: master.example.com
      containerized: true
      schedulable: no
      roles:
      - master
      - node
    - ip: 10.0.0.2
      hostname: node1-private.example.com
      public_ip: 24.222.0.2
      public_hostname: node1.example.com
      node_labels:
        region: primary
        zone: default
      roles:
      - node
    - ip: 10.0.0.3
      hostname: node2-private.example.com
      public_ip: 24.222.0.3
      public_hostname: node2.example.com
      node_labels:
        region: secondary
        zone: default
      roles:
      - node
    - ip: 10.0.0.4
      hostname: etcd1-private.example.com
      etcd_interface: eth0
      roles:
      - etcd
  roles:
    master:
      default_subdomain: apps.test.example.com
      master_identity_providers:
      - name: htpasswd_auth
        login: true
        challenge: true
        kind: HTPasswdPasswordIdentityProvider
        filename: /etc/openshift/htpasswd
    node:
      kubelet_args:
        max_pods:
        - "100"
      storage_plugin_deps:
      - ceph
      - glusterfs
    etcd:
      initial_cluster_token: etcd-cluster-1
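
One way the per-host roles lists above could be flattened back into inventory groups; a sketch only, with an assumed role-to-group mapping:

def group_hosts_by_role(deployment):
    # Map config role names onto the inventory group names used today.
    group_for_role = {'master': 'masters', 'node': 'nodes', 'etcd': 'etcd'}
    groups = {name: [] for name in group_for_role.values()}
    for host in deployment['hosts']:
        for role in host.get('roles', []):
            group = group_for_role.get(role)
            if group is not None:
                groups[group].append(host['ip'])
    return groups

# For the example config above this would yield:
# {'masters': ['10.0.0.1'],
#  'nodes': ['10.0.0.1', '10.0.0.2', '10.0.0.3'],
#  'etcd': ['10.0.0.4']}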

detiber commented 8 years ago

@smunilla the only critique I have of your suggestion is that I would like to see roles under deployment instead of at the same level.

Also, I think we'll need to make sure that the roles entries aren't strictly required (that way, if a role is mentioned for a host but doesn't have an entry under roles, we don't error out).

smunilla commented 8 years ago

@detiber I definitely meant to put roles under deployment. Copy-paste fail on my part.

And I agree. We should have sensible defaults for all the roles, and the user can override them or add additional settings only if they want to.
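
As a small sketch of what those defaults plus user overrides could look like (the default values here are placeholders, not agreed-upon defaults):

# Hypothetical built-in defaults; a role missing from the config's 'roles'
# section simply keeps these unchanged.
ROLE_DEFAULTS = {
    'master': {},
    'node': {'kubelet_args': {}},
    'etcd': {},
}

def resolve_role_vars(deployment):
    user_roles = deployment.get('roles', {})
    resolved = {}
    for role, defaults in ROLE_DEFAULTS.items():
        merged = dict(defaults)
        merged.update(user_roles.get(role, {}))
        resolved[role] = merged
    return resolved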

akostadinov commented 8 years ago

I think we need heavier host configuration, e.g. all the options Ansible needs to connect to the host: ssh_user, ssh_key/password, etc. Also make sure the host configuration can be extended later to allow launching hosts on cloud services directly via Ansible. Another thing to think about is adding the option to create new DNS names for the launched machines, as some clouds do not have proper DNS out of the box (e.g. some OpenStack instances). Generating DNS names for the routes is also highly desirable. I'm not saying to implement these right away, just to give some thought to how they can be integrated into the config.

For hosts I'd say something like:

hosts:
- num: 2
  IaaS: AWS
  create_opts: <mapping of IaaS API specific options to create instance>
  roles: [ "some", "roles" ]
  rewrite_dns: ???
  instance_name_prefix: my_instance_name_prefix_ # we append num after prefix

We'll also need a custom config section for setting common IaaS options like auth credentials, etc.

detiber commented 8 years ago

@akostadinov the current proposed format can be found here: https://gist.github.com/detiber/da042623b26522fcd5767825eafe97a0

The PR to implement it is here: https://github.com/openshift/openshift-ansible/pull/1778

I think we need heavier host configuration, e.g. all the options Ansible needs to connect to the host: ssh_user, ssh_key/password, etc.

The SSH user can already be specified.

I do agree that the SSH key and password should be configurable as well, but that would be a separate feature we would need to add.

Also make sure the host configuration can be extended later to allow launching hosts on cloud services directly via Ansible.

My thought here is that we can add a key to the deployment for specifying a list of instance providers (I don't want to say cloud providers, since it could also include things like libvirt, RHEV, CloudForms, etc.).

Another thing to think about is adding the option to create new DNS names for the launched machines, as some clouds do not have proper DNS out of the box (e.g. some OpenStack instances). Generating DNS names for the routes is also highly desirable. I'm not saying to implement these right away, just to give some thought to how they can be integrated into the config.

Yes, we are working on addressing this. We already have a DNS role that we are building upon to provide this for the ManageIQ integration.

For hosts I'd say something like:

hosts:
- num: 2
  IaaS: AWS
  create_opts: <mapping of IaaS API specific options to create instance>
  roles: [ "some", "roles" ]
  rewrite_dns: ???
  instance_name_prefix: my_instance_name_prefix_ # we append num after prefix

I'm not sure that I'm a fan of overloading hosts in this way. We are still a ways off from implementing direct cloud provider integration for spinning up hosts, but in general we need to specify the following types of items:

After typing that out, I kind of see the deployment-level cloud provider info sitting under the deployment key and the role-based info sitting under roles.

detiber commented 8 years ago

@smunilla @abutcher: your thoughts on the above?

akostadinov commented 8 years ago

@detiber, I was also thinking about putting some config under roles. But machines will often have multiple roles, and then handling the merging of IaaS options and their order of precedence becomes complicated. Also, deployment options like VPC, security group, etc. are highly IaaS-provider specific. In my installer I ended up mapping the YAML as closely as possible to the underlying IaaS API, so that the code only translates the YAML into the proper structure for the API calls. That's why I think the instance-creation options for IaaS (or non-IaaS: libvirt, etc.) should live in one place, independent of machine roles. Just my two cents; I'm not insisting on anything. I just came to these conclusions to allow:

  1. full flexibility of configuration
  2. avoid maintaining abstraction over different IaaS providers
  3. make it reasonably easy to figure out what options one can set by reading the IaaS API
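
To illustrate that pass-through idea with one concrete (assumed) provider: using boto3, the create_opts mapping from such a host entry could be handed to the EC2 API unmodified, so the config surface is whatever the API accepts rather than an abstraction the installer has to maintain. This is only a sketch of the approach, not anything that exists in openshift-ansible.

import boto3

def launch_hosts(host_group):
    # host_group is a dict shaped like the 'hosts' example above
    # (num, IaaS, create_opts, ...); only the AWS case is sketched.
    if host_group.get('IaaS') != 'AWS':
        raise NotImplementedError('only the AWS case is sketched here')
    ec2 = boto3.client('ec2')
    # create_opts is passed straight through to run_instances, so anything
    # the EC2 API accepts (ImageId, InstanceType, SubnetId, ...) can be set
    # in the config without the installer having to know about it.
    return ec2.run_instances(
        MinCount=host_group['num'],
        MaxCount=host_group['num'],
        **host_group.get('create_opts', {})
    )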

With regard to DNS, I think somebody asked about the DNS role and the answer was that it is something else, not about creating DNS records for the routes and nodes. My suggestions were made after I looked at the gist.

sdodson commented 8 years ago

Proposal has been implemented, minor fixups ongoing.