BCDevOps / devops-requests

This repository is used to track the DevOps requests for platform services team.

Please join our Ministry of Education namespaces (c2mvws & mvubjx) to allow for other protocols instead of HTTPS #328

Closed. arcshiftsolutions closed this issue 4 years ago

arcshiftsolutions commented 4 years ago

Hello,

The Ministry of Education team has created the following documentation supporting our requirement to have these namespaces joined:

https://github.com/bcgov/EDUC-PEN-REQUEST-SAGA-API/blob/master/rfc/namespaces-joined-rfc.md

We modelled our request on the one from the Registries team, which has the same requirements: https://github.com/bcgov/entity/blob/master/rfcs/rfc-registry-namespace-connectivity.md

Please let us know if you would like to discuss or would like further detail.

Thanks!

caggles commented 4 years ago

Hello @arcshiftsolutions - please let me know where you got the link that brought you to this request template before we begin!

arcshiftsolutions commented 4 years ago

Hi @caggles - I couldn't find any template to fit, so I started one of the other templates and modified the URL to just create a regular issue with no type. :)

Here is the URL I used: https://github.com/BCDevOps/devops-requests/issues/new

caggles commented 4 years ago

Gotcha! In the future, please be careful about modifying the type, because doing so also removed the assignment to Shelly and me. I just wouldn't want us to miss a ticket :) Instead, I would recommend using the "New Request Type" ticket for these sorts of requests - this also lets us consider adding a specific request type for the request in question, if appropriate!

Anyway, could you please clarify what you mean by wanting to "join" the two namespaces? I'm not sure what end result you're looking for!

arcshiftsolutions commented 4 years ago

Hi @caggles - will do regarding tickets going forward.

In terms of the namespaces being joined, we have this requirement since we use NATS as our central guaranteed messaging solution. NATS relies on TCP connections exclusively (no HTTP). We also have the same requirement to connect to our Patroni instance across Education namespaces.
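A quick way to see the TCP requirement in practice is a plain connection check from a pod in one namespace to the NATS service in the other. This is only a sketch: the service name `nats.mvubjx-dev.svc` is a hypothetical example (the thread does not give the actual service name), and 4222 is the default NATS client port.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Hypothetical service name (assumption); 4222 is the default NATS client port.
NATS_HOST = "nats.mvubjx-dev.svc"
NATS_PORT = 4222
```

Calling `tcp_reachable(NATS_HOST, NATS_PORT)` from a pod in c2mvws-dev would be expected to return False while the projects are isolated, and True once they share a network.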

I've provided specific detail at this link:

https://github.com/bcgov/EDUC-PEN-REQUEST-SAGA-API/blob/master/rfc/namespaces-joined-rfc.md

The diagram really highlights why we need the environments joined:

https://raw.githubusercontent.com/bcgov/EDUC-PEN-REQUEST-SAGA-API/master/rfc/Saga.png

What we're looking for is to allow service calls between c2mvws & mvubjx. We explored this on the RocketChat channels with several other teams; in the end we felt the Registries model of joining namespaces is what we need (they were looking to accomplish the same thing).

This link provides detail on their RFC:

https://github.com/bcgov/entity/blob/master/rfcs/rfc-registry-namespace-connectivity.md

Please let me know if you would like further detail.

Thanks!

jefkel commented 4 years ago

@arcshiftsolutions your NATS backend service designs look great (the way you're fronting all external communication through APIs).

My initial quick look at your service diagram shows that the only non-HTTP connection requirements would be solved by shifting the PEN Request SAGA API to the same namespace that its database is running in. This would also seem to solve the NATS communication barrier.

All other interface points in your designs are through well-designed HTTP(S) APIs and interfaces, and I'm not sure how they'd directly benefit from a joined network design.

arcshiftsolutions commented 4 years ago

Hi @jefkel - I kept the diagram a bit tight to highlight the issue. This is the first SAGA service we're setting up for the solution, but we now have a second SAGA service in play with more on the way. We're concerned with going over in the common namespace for resources, and we're trying to keep the environments organized by keeping only Education common components within the (common) namespace. Business logic related to PEN should stay in the PEN namespace.

We're actively growing the solution; in the near future we'll have other Education teams/namespaces looking to integrate their own SAGA against NATS and the common namespace. Hopefully we're at OS4 by then, but in the meantime, we don't have a lot of wiggle room given our resource constraints and we're trying to ensure we're organizing our services for OS4.

mitovskaol commented 4 years ago

@arcshiftsolutions Am I understanding correctly that with this design, every namespace like PEN that needs to make use of the resources in the common namespace will have to be joined with the common namespace? Joining two namespaces in OpenShift presents a security risk, as it removes the additional security control that OpenShift provides by isolating projects/namespaces with the OCP 3.11 multi-tenant plugin. If you continue with this approach, the security risk will grow with every new namespace that you join with the common components namespace. We can help you with joining the namespaces specified in this request if you are under pressure to deliver the project against the deadline, but would like to encourage you to re-think the design approach going forward. Let me know if you would like us to proceed with joining the namespaces.

arcshiftsolutions commented 4 years ago

@mitovskaol - Hi Olena - we're aware of the security implications of joining namespaces. Unfortunately, there are no other practical options when using products that work exclusively over TCP (NATS). There are alternatives, but they're not technically palatable; tools such as Jaeger could have their agents wrapped in an HTTP server, but the added latency and security overhead would defeat much of the purpose of this type of tool/product. We did explore alternatives in the RocketChat channels; where we landed was exactly where the Registries team did: we'll need to have the namespaces joined if we want to use NATS cross-namespace.

I understand things will change in OCP4+; our team is continuing to design our architecture with this in mind. Our design is simple; common components in the shared namespace and the business specific components in their own namespace(s). This allows for correct security for each business unit at the namespace level while leaving the common components shared.

We're looking to move forward in the short term with NATS (our code is built, we're waiting on this decision for joining namespaces for deployment). Our current design has two of our SAGA services in the PEN namespace (c2mvws), along with one other SAGA service in the Common namespace (mvubjx). All of these connect via TCP to NATS as well as to our Patroni SAGA instance. All that to say, if we can get these namespaces joined, it will unblock us to proceed with our work.

Thanks!

jefkel commented 4 years ago

> This allows for correct security for each business unit at the namespace level while leaving the common components shared.

Clarification: The above is not what you are requesting. Joining namespaces in OCP3 has the same effect as putting all of the joined namespaces into the same network (i.e., mesh connectivity, NOT hub and spoke as your statement implies).
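The mesh effect can be illustrated with a small model. This is a simplified sketch, assuming the OCP 3.x multi-tenant behaviour where joined projects end up sharing one network ID; the numeric IDs and the `other-team-dev` project are made up for illustration.

```python
from typing import Dict, Iterable


def join_projects(netids: Dict[str, int], target: str, projects: Iterable[str]) -> None:
    """Model a join: every listed project takes on the target's network ID."""
    for project in projects:
        netids[project] = netids[target]


def can_reach(netids: Dict[str, int], a: str, b: str) -> bool:
    """In this model, pods can talk directly only when network IDs match."""
    return netids[a] == netids[b]


# Hypothetical starting state: each project isolated on its own ID.
netids = {"mvubjx-dev": 100, "c2mvws-dev": 101, "other-team-dev": 102}

# Two teams each join "only" the common namespace...
join_projects(netids, "mvubjx-dev", ["c2mvws-dev"])
join_projects(netids, "mvubjx-dev", ["other-team-dev"])

print(can_reach(netids, "c2mvws-dev", "mvubjx-dev"))      # True
# ...yet they can now also reach each other: mesh, not hub and spoke.
print(can_reach(netids, "c2mvws-dev", "other-team-dev"))  # True
```

In this model, every project joined to the common namespace becomes reachable from every other joined project, which is the growing risk described earlier in the thread.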

jefkel commented 4 years ago

One other comment re: shared DB services:

I'm not sure why you are looking at standing up a single, shared instance of Patroni. General good practice is to create a separate database deployment for every schema:

The business justification used by the Registries team was specifically to reduce the costs associated with licensed DB Software (Enterprise DB), and will likely be a pain point in the future once different performance profiles become a concern with heavier use.

arcshiftsolutions commented 4 years ago

Hi @jefkel - I was thinking more along the lines of secrets, config maps, etc.: objects which require membership in the namespace for access (please correct me if membership is also shared when joining; it's new to me). It's not really a concern for us, in that our team will be managing both namespaces for the foreseeable future.

In terms of the Patroni instance; our plan was to use a single Patroni instance for SAGAs, but with multiple schemas as you note. We're using Flyway for our automated rollouts. We went this direction to keep our namespace resources down and lessen our required maintenance/backups. Performance wasn't a huge concern given that processes using this DB are asynchronous. I'm happy to revisit that with my team given your advice above.

Would like to ask how long it takes to join the namespaces once it is approved? We're just planning our next upcoming work and would like this included if possible.

jefkel commented 4 years ago

Object access/namespace membership/access is still separated by namespace as you mention. I wanted to make sure that the network security aspect of the join was understood.

WRT the comment by @mitovskaol about the design pattern going forward, I would echo her caution about creating non-http shared services. This type of service will require additional operational overhead (even in OCP4).

Once approved, the implementation is just as fast as we can document/PR the changes and automate the commands.

Details we will need are the specific namespaces to be joined: namespaceA: {list of namespaces to join to namespaceA}

arcshiftsolutions commented 4 years ago

Thanks @jefkel - here is the namespace detail for our join:

PEN Namespace (c2mvws) <> Education Common Services (mvubjx)

jefkel commented 4 years ago

Those look like project set names, not namespace names. Please be as specific as possible. (ie: do you need c2mvws-tools joined to mvubjx-tools? etc..)

arcshiftsolutions commented 4 years ago

Apologies @jefkel - here they are:

c2mvws-dev <> mvubjx-dev
c2mvws-test <> mvubjx-test
c2mvws-prod <> mvubjx-prod

jefkel commented 4 years ago

are you certain you need the -tools namespaces joined? (I guess I should have used a different example above? I just grabbed the first one in my window to cut and paste)

arcshiftsolutions commented 4 years ago

Good question @jefkel; I don't believe we'll need tools joined (just updated my comment above).

c2mvws-dev <> mvubjx-dev
c2mvws-test <> mvubjx-test
c2mvws-prod <> mvubjx-prod
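For reference, the pairs above could be expanded into per-environment join commands along these lines. This is a sketch only: the thread does not show the commands the platform team actually ran. It assumes the OCP 3.x `oc adm pod-network join-projects` command, and using the common namespace as the `--to` target is an arbitrary choice for illustration.

```python
from typing import List

PEN = "c2mvws"
COMMON = "mvubjx"
ENVIRONMENTS = ["dev", "test", "prod"]  # -tools is intentionally left out


def join_commands(target_set: str, joining_set: str, envs: List[str]) -> List[str]:
    """Render one join command per environment (commands are only printed, not run)."""
    return [
        f"oc adm pod-network join-projects --to={target_set}-{env} {joining_set}-{env}"
        for env in envs
    ]


for command in join_commands(COMMON, PEN, ENVIRONMENTS):
    print(command)
```

Keeping the environment list explicit makes it easy to confirm that only dev, test, and prod are included before anything is applied.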

I've followed up with my team around our SAGA implementation and why we can't use an HTTP based solution; there simply aren't any messaging providers which work with HTTP (we're also looking for async). Further to that, our common microservices are all central players in our SAGA flow; they need access to NATS. We did explore several different options (asking other platform teams, researching online), but unfortunately we don't have any alternatives at this time.

mitovskaol commented 4 years ago

It looks like the issue has been discussed extensively and the EDUC team has made the decision to proceed with joining the namespaces. @sbarre-esit, can you please work with the EDUC team to test and implement the change?

jefkel commented 4 years ago

I'll coordinate with @caggles on creating an appropriate PR for @sbarre-esit to action.

arcshiftsolutions commented 4 years ago

@mitovskaol @jefkel - Thank you both for your advice & help with this item. Our team really appreciates you looking over our architecture and providing direction.

StevenBarre commented 4 years ago

@arcshiftsolutions when would you like this done? There will be some downtime to your networking as the projects are joined.

arcshiftsolutions commented 4 years ago

Hi @sbarre-esit - for the DEV/TEST environments we're flexible. For Production we'll have to schedule it likely during one of our support team's outage windows. I'll have to confirm with them.

StevenBarre commented 4 years ago

Dev and Test are now joined.

arcshiftsolutions commented 4 years ago

Thanks @sbarre-esit - confirmed it's working. I'm going to reach out to you on RC with Shari around Production timing.

StevenBarre commented 4 years ago

PROD ns joined as well now. This can be closed.

arcshiftsolutions commented 4 years ago

@sbarre-esit @jefkel @mitovskaol @caggles @ShellyXueHan.

Thank you so much from the Education team for getting this in for us!! :)

requestron[bot] commented 4 years ago

This issue has now been closed. It has been completed, unless a comment indicates otherwise.

If you have additional problems or questions, please feel free to ask the community on RocketChat on the #devops-howto channel!