rocky-linux / infrastructure

The infrastructure monorepo for the Rocky Linux project. This project will be archived/deprecated in the future.
https://rockylinux.org

Internal Naming Scheme? #7

Open Gorian opened 3 years ago

Gorian commented 3 years ago

suggestion

Slack Link

Naming Scheme

{identifier - 2 characters}{location code - 2 characters}{system type - 1 character}{server type - 1 character}{environment - 1 character}{function - 4 characters}{instance number - 3 characters}[s for service processor]
identifier:
    group / corp / etc
location code:
    where the servers are
System Type:
    s = server
    w = workstation
    n = network gear. Switch, etc.
Server Type:
    p = production
    s = staging
    t = testing
    d = development
environment:
    p = physical
    v = virtual
function:
    defines the function of the server.
    ad = Active Directory
    ns = Name Server
    edge = network edge, firewall, router, etc.
    osn = OpenStack Node
instance number:
    defines the instance number of the server, used if a service has more than
    one host or node. starts at 001, always pad with zeros.
service processor - drac, ILO, etc.

example

rlnaspvweb001.in.rockylinux.org

rl = rocky linux
na = north america
s = server
p = production
v = virtual
web = http host
001 = first instance

in = internal domain

just a starting point. 15 characters to be NetBIOS compatible, not required. The identifier field can be removed and replaced with a 4-character location field for example.
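A minimal sketch of how the fixed-width scheme above could be parsed, assuming lowercase hostnames and treating the function field as "up to 4 letters" (the example uses the 3-letter `web`). The names `HOSTNAME_RE` and `parse_hostname` are hypothetical and not part of any Rocky Linux tooling.

```python
import re

# Field order and widths follow the proposed scheme. The function field is
# assumed to be 1-4 letters, since the example hostname uses "web".
HOSTNAME_RE = re.compile(
    r"^(?P<identifier>[a-z]{2})"     # group / corp, e.g. "rl"
    r"(?P<location>[a-z]{2})"        # location code, e.g. "na"
    r"(?P<system_type>[swn])"        # server / workstation / network gear
    r"(?P<server_type>[pstd])"       # production / staging / testing / development
    r"(?P<environment>[pv])"         # physical / virtual
    r"(?P<function>[a-z]{1,4})"      # e.g. "web", "ad", "edge"
    r"(?P<instance>\d{3})"           # zero-padded instance number
    r"(?P<service_processor>s)?$"    # optional trailing "s"
)


def parse_hostname(name):
    """Split a hostname (or FQDN) into the fields of the proposed scheme."""
    short = name.split(".", 1)[0].lower()  # keep only the label before the first dot
    match = HOSTNAME_RE.match(short)
    if match is None:
        raise ValueError(f"{short!r} does not follow the naming scheme")
    fields = match.groupdict()
    fields["service_processor"] = fields["service_processor"] is not None
    return fields


print(parse_hostname("rlnaspvweb001.in.rockylinux.org"))
```

With every field at its maximum width, plus the service-processor suffix, the short name is exactly 15 characters, which matches the NetBIOS limit mentioned above.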

Rorasaurus commented 3 years ago

Looks good. Achieves the goal of telling you exactly what the server does and where it is, without claiming too much space in your shell.

nazunalika commented 3 years ago

I like the idea. Would we name "IPA" just 'id' or 'im' for identity management? Since of course IPA isn't AD.

Gorian commented 3 years ago

> I like the idea. Would we name "IPA" just 'id' or 'im' for identity management? Since of course IPA isn't AD.

To be honest, I mostly copied and pasted from a proposal I made for a role I was on a few years ago - I figure we can define whatever we want to fit it - LDAP, IPA, SSO, or whatever else. As mentioned on the call, we probably don't need the group/org field, so we could easily scrap it to free up characters for other uses and still stay under 15 characters

Gorian commented 3 years ago

Also just pointing out that DNS aliases are a thing - best practice would be to set the canonical hostnames to a defined naming scheme and then use CNAMEs / aliases for "friendly" names. If you wanted build-server.in.rockylinux.org, for example, that can easily just be an alias to a host that follows the naming scheme.

nicosalvadore commented 3 years ago

Using IATA codes for location is pretty standard. I think we can drop the rl and use IATA code + a number to define the location. Or it's also frequent to directly insert the location or region as a subdomain. The size of the desired/needed infrastructure is what should drive this decision I think.

NeilHanlon commented 3 years ago

I agree with @nicosalvadore and would like to use iata codes as they line up with things that show up in, for example, traceroutes.

There is some.. err.. pushback to using lots of subdomains, so I'm OK with not doing that.

Nickster2013 commented 3 years ago

Definitely in favor of not having excessive subdomains. However, in the above naming convention, do we really want prod/staging/testing/dev identifiers in the hostname? Or would it be better to extract that out as a subdomain?

My thought here is in regards to automated deployment of new hosts. If we have the env as part of the individual hostname, infrastructure scripts have to be aware of which env they are running against in order to appropriately name them. Whereas abstracting out the env better allows the same infrastructure code (differentiated solely by branch) to be run against any environment resulting in identical environment structures.

Nickster2013 commented 3 years ago

I would also make the same argument about IATA DCC.

Adapting @Gorian's suggestion, it might look something like this: {identifier - 2 characters}{system type - 1 character}{environment - 1 character}{function - 4 characters}{instance number - 3 characters}.{IATA DCC}.{prod|staging|testing|dev}.rockylinux.org

rlsvweb001.sfo.prod.rockylinux.org

rl = rocky linux
s = server
v = virtual
web = http host
001 = first instance

sfo = San Francisco
prod = production

schlitzered commented 3 years ago

i do not agree with the above naming scheme.

first off, why "001" for counting instances? why not simply start with 1? when you hit 9, you simply continue with 10.

having a fixed number of digits for counting cluster members introduces an unneeded barrier (what if, for whatever reason, we hit 999?)

also, environment/stage and location/region should each be put into a dedicated subdomain.

without subdomains it would not be possible for FreeIPA clients to use DNS service discovery to find the nearest FreeIPA server.

generally i would suggest a naming schema like this:

$host.$env.$region.example.com

example: ipa-1.prod.us-east-1.example.com.

this scheme will allow you to send IdM clients to the closest IdM server using service discovery.

and i speak from experience, you do not want a client hosted in europe to talk to an IdM server in the US, the latency kills performance.

i would also suggest to have a dedicated subdomain to host internal stuff. this makes it easier to keep public resources and internal resources separated. sure you can use DNS views, but managing this can be a pain. a dedicated internal subdomain is way easier to handle:

so the example above could become: ipa-1.prod.us-east-1.infra.example.com.
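As a rough illustration of the `$host.$env.$region` layout, the sketch below splits an FQDN into those three labels relative to a base domain. The helper name `split_fqdn` is hypothetical, and the code only demonstrates the naming layout; it does not perform any DNS lookups or FreeIPA service discovery.

```python
def split_fqdn(fqdn, base_domain):
    """Split '$host.$env.$region.<base_domain>' into its three labels."""
    name = fqdn.rstrip(".").lower()          # tolerate a trailing root dot
    suffix = "." + base_domain.rstrip(".").lower()
    if not name.endswith(suffix):
        raise ValueError(f"{fqdn!r} is not under {base_domain!r}")
    labels = name[: -len(suffix)].split(".")
    if len(labels) != 3:
        raise ValueError(f"expected host.env.region, got {labels!r}")
    host, env, region = labels
    return {"host": host, "env": env, "region": region}


# The two examples from the comment above.
print(split_fqdn("ipa-1.prod.us-east-1.example.com.", "example.com"))
print(split_fqdn("ipa-1.prod.us-east-1.infra.example.com.", "infra.example.com"))
```

Because the region is its own label, a client and a server can be matched on region by comparing one field rather than by decoding characters inside the hostname.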

NeilHanlon commented 3 years ago

Many of these issues have been addressed on the forums; responses inline.

> i do not agree with the above naming scheme.

> first off, why "001" for counting instances? why not simply start with 1? when you hit 9, you simply continue with 10.

The arity has been decided on. The reason you do not do this is because you want your hostnames to be deterministic. Rolling over from 9 to 10 (1 byte to 2 bytes) will undoubtedly break all our tooling. I've been through this. I don't want to go through this.

> having a fixed number of digits for counting cluster members introduces an unneeded barrier (what if, for whatever reason, we hit 999?)

I'm much less concerned about hitting 1000 of any one type of thing. We're building an OS. I doubt we will have over 100 of anything, so realistically an arity of 2 (e.g. 01 through 99) would be okay, but when we have the bytes, let's use them. I've even had this argument about encoding in hex instead of decimal.. They all go the same place. Regardless, padding numbers is good practice.
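A small sketch of why fixed-width, zero-padded instance numbers keep hostnames predictable: plain string sorting stays in numeric order and every name has the same length. The `make_hostname` helper and the `web` prefix are illustrative only.

```python
def make_hostname(function, instance, width=3):
    """Build '<function><zero-padded instance>', e.g. web001."""
    return f"{function}{instance:0{width}d}"


instances = (1, 2, 10)
unpadded = [f"web{n}" for n in instances]
padded = [make_hostname("web", n) for n in instances]

print(sorted(unpadded))  # ['web1', 'web10', 'web2'] - string sort breaks numeric order
print(sorted(padded))    # ['web001', 'web002', 'web010']
print({len(name) for name in padded})  # a single length, so fixed-offset parsing works
```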

> also, environment/stage and location/region should each be put into a dedicated subdomain.

I'm on the fence on this. Personally I like having subdomains, even if we automate the creation of shorter CNAMEs which point to them. However I mostly defer to our SME on IPA, @nazunalika. Ultimately I think we will need at least one subdomain to encode location information to make the hostnames readable. I'd like to hear more opinions on this.

> without subdomains it would not be possible for FreeIPA clients to use DNS service discovery to find the nearest FreeIPA server.

> generally i would suggest a naming schema like this:

> $host.$env.$region.example.com

> example: ipa-1.prod.us-east-1.example.com.

> this scheme will allow you to send IdM clients to the closest IdM server using service discovery.

> and i speak from experience, you do not want a client hosted in europe to talk to an IdM server in the US, the latency kills performance.

Like most things in computer science and networking, there is more than one way to accomplish the same goal. I trust @nazunalika 's ability to set up our IPA properly and make this work. We will document what is done and how it's setup. Just a matter of that.

> i would also suggest to have a dedicated subdomain to host internal stuff. this makes it easier to keep public resources and internal resources separated. sure you can use DNS views, but managing this can be a pain. a dedicated internal subdomain is way easier to handle:

> so the example above could become: ipa-1.prod.us-east-1.infra.example.com.

Again, I'm on the fence on this, but only because I come from managing AD and not IPA, where this is best practice. It appears that the IPA documentation says that putting IPA on a subdomain is strictly not best practice for IPA, and so I think we must go with best practices. Like other things, can it work in other configs? Certainly.. If other people feel strongly about this, please lodge your comments here.

danieltharp commented 3 years ago

I'm out of gas to discuss this tonight but hope to collect my thoughts on sustainable naming for hosts, FQDNs and CNAMEs tomorrow. Writing this as a note-to-self to come back to it in the AM.

danieltharp commented 3 years ago

Right, forgot about this.

I indicated in infra slack that 2 characters isn't enough for a viable location code, and I also don't see significant value in redundant rl at the beginning of every device in our network. That gives us four characters back against a suggested limit of 15. I also don't think the type of device needs its own character if we're also including function space. So that's 5 back.

Assume 3 will be for numbering at all times, 1 for metal (so as not to be confused with production if you went with physical) or virtual or out of band management instead of an alternate reserved character, and 1 of d|t|s|p for dev, test, staging, or prod environment. That would leave 10 characters for describing what the device does.

The order of granularity here would then be: [function][d|t|s|p][m|v|o][NNN]

e.g., freeipapm001, cachetv002, mailsm001, kojidv004, freeipapo001
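The `[function][d|t|s|p][m|v|o][NNN]` layout can be checked mechanically. The following is a sketch only, assuming lowercase letters for the function and the 10-character function limit mentioned above; `SHORT_NAME_RE` and `parse_short_name` are hypothetical names.

```python
import re

# function: 1-10 letters; stage: d/t/s/p; platform: m/v/o; instance: 3 digits.
SHORT_NAME_RE = re.compile(
    r"^(?P<function>[a-z]{1,10})"
    r"(?P<stage>[dtsp])"
    r"(?P<platform>[mvo])"
    r"(?P<instance>\d{3})$"
)

STAGES = {"d": "dev", "t": "test", "s": "staging", "p": "prod"}
PLATFORMS = {"m": "metal", "v": "virtual", "o": "out-of-band management"}


def parse_short_name(hostname):
    """Decode a short hostname such as 'freeipapm001'."""
    match = SHORT_NAME_RE.match(hostname.lower())
    if match is None:
        raise ValueError(f"{hostname!r} does not follow the naming scheme")
    return {
        "function": match["function"],
        "stage": STAGES[match["stage"]],
        "platform": PLATFORMS[match["platform"]],
        "instance": int(match["instance"]),
    }


for name in ("freeipapm001", "cachetv002", "mailsm001", "kojidv004", "freeipapo001"):
    print(name, parse_short_name(name))
```

The regular expression relies on backtracking: the function field is matched greedily, then shrinks until the last two letters before the digits fit the stage and platform sets.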

Granularity for scale and blast radius will also be necessary in the FQDN to accurately assess issues quickly. I would propose:

[hostname].[function].[stage].[region]-[subregion][-subregion number if applicable][-availability zone if applicable].[provider].rockylinux.org

In instances where there is neither a subregion number or availability zone, the subregion should then be in the form of the closest major IATA designation in lowercase.

Examples:

The strategy here is that you can readily identify the common cause of an outage by working logically through the left-to-right order of the fully qualified domain name.

Darkbat91 commented 3 years ago

I am completely in concurrence with the naming scheme and love the thought of how that is broken out. My only question would be why the duplication between the [d|t|s|p] character and the koji.dev sections? The .dev breaks it out for DNS-aware applications, but where is the value in duplicating it?

Again, I agree, and this is well described; I'm just curious about that one value.

danieltharp commented 3 years ago

Basically so it's easier to see issues are only affecting *.koji.dev.* than having to grok the hostnames in an outage.

codrcodz commented 3 years ago

@pxdnbluesoul, I think @Darkbat91 was asking why it's needed in two places. To answer that: a host typically only displays everything to the left of the first dot on its prompt (aka $PS1) by default. By putting an environment indicator there, you reduce the risk of someone running commands in the wrong environment. The other way I've seen this same thing accomplished is adjusting $PS1 on all the servers to color code the $PS1 based on environment. This works so long as no one blind or color-blind is managing your infrastructure.

codrcodz commented 3 years ago

@pxdnbluesoul, One more thing I would consider before finalizing this is whether or not you plan on doing A/B deployments in the future. It can be rather difficult to rehostname servers if they are clustered or applications installed on them are sensitive to hostname changes.

As a result, it's a good idea to figure out in advance how you plan to handle A/B deployment hostnaming if you think you will use it at any point. Since A is sometimes prod while B is staging and vice versa, the p/s naming schema breaks down pretty quickly.

If you don't plan on using A/B deployments though... this schema should work just fine.

danieltharp commented 3 years ago

> @pxdnbluesoul, I think @Darkbat91 was asking why it's needed in two places. To answer that: a host typically only displays everything to the left of the first dot on its prompt (aka $PS1) by default. By putting an environment indicator there, you reduce the risk of someone running commands in the wrong environment. The other way I've seen this same thing accomplished is adjusting $PS1 on all the servers to color code the $PS1 based on environment. This works so long as no one blind or color-blind is managing your infrastructure.

This was well put, thank you.

Fair point as well re: A/B. I generally deploy with canaries rather than blue/green so I wasn't thinking in those terms. Definitely food for thought, thank you again @codrcodz.

Edit to ask: Would it be reasonable to add CNAMEs to hosts when such testing or deploying is going on and have the above be the canonical name?

NeilHanlon commented 3 years ago

I think if we did anything I'd want to do canary testing. So I'm OK with not planning on dealing with it in DNS.

Sounds like a future Rocky / us problem

codrcodz commented 3 years ago

> Edit to ask: Would it be reasonable to add CNAMEs to hosts when such testing or deploying is going on and have the above be the canonical name?

@pxdnbluesoul, absolutely. In my experience, things go much more smoothly when every host has exactly one A record, and everything else is a CNAME. I think I even mention this in the reference architecture excerpt I posted in the IP schema issue.
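The "exactly one A record per host, everything else a CNAME" rule lends itself to a simple lint. The sketch below is one possible check, assuming records are available as `(name, type, value)` tuples; `check_records` is a hypothetical helper and the `192.0.2.x` addresses are documentation placeholders, not real project addresses.

```python
from collections import Counter


def check_records(records):
    """Return a list of problems for the 'one A record, rest CNAME' rule."""
    a_per_ip = Counter(value for _, rtype, value in records if rtype == "A")
    a_names = {name for name, rtype, _ in records if rtype == "A"}
    problems = []
    for ip, count in a_per_ip.items():
        if count > 1:
            problems.append(f"{ip} has {count} A records")
    for name, rtype, value in records:
        if rtype == "CNAME" and value not in a_names:
            problems.append(f"{name} points at {value}, which has no A record")
    return problems


good = [
    ("rlnaspvweb001.in.rockylinux.org", "A", "192.0.2.10"),
    ("build-server.in.rockylinux.org", "CNAME", "rlnaspvweb001.in.rockylinux.org"),
]
bad = good + [("www.in.rockylinux.org", "A", "192.0.2.10")]

print(check_records(good))
print(check_records(bad))
```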

aevtech commented 3 years ago

Hello,

Maybe you guys have figured this part out already but this is how DNS is done at our locations. For example:

app - cluster - team - purpose - location - environment - domain

So, if we made some haproxy host for the repos by the infra team, it would look something like this:

haproxy[01-02].c01.inf.repos.nj1.prod.rockylinux.org

This hostname tells us what the host is, its cluster in case there are multiples, who it belongs to, the purpose of the host, where it's located, and what environment it's in. We also break everything down in Ansible using inventory files, so we don't have to worry about which environment things run on since it is all separated, and we use the AWX UI to deploy the Ansible code.
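As an illustration of the `app - cluster - team - purpose - location - environment - domain` layout, the sketch below expands the `[01-02]` range shorthand and splits each resulting name into labeled fields. The helper names `expand_range` and `parse_fqdn` are hypothetical, and the code assumes at most one bracketed range per pattern.

```python
import re

FIELDS = ("app", "cluster", "team", "purpose", "location", "environment")
RANGE_RE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_range(pattern):
    """Expand 'haproxy[01-02].…' into one name per instance, keeping zero padding."""
    match = RANGE_RE.search(pattern)
    if match is None:
        return [pattern]
    start, end = match.group(1), match.group(2)
    width = len(start)  # preserve the padding used in the pattern
    return [
        pattern[: match.start()] + f"{n:0{width}d}" + pattern[match.end():]
        for n in range(int(start), int(end) + 1)
    ]


def parse_fqdn(fqdn):
    """Map the first six labels to FIELDS; the remainder is the domain."""
    labels = fqdn.rstrip(".").split(".")
    if len(labels) <= len(FIELDS):
        raise ValueError(f"{fqdn!r} has too few labels")
    parsed = dict(zip(FIELDS, labels))
    parsed["domain"] = ".".join(labels[len(FIELDS):])
    return parsed


for host in expand_range("haproxy[01-02].c01.inf.repos.nj1.prod.rockylinux.org"):
    print(parse_fqdn(host))
```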