shopozor / services

Micro-services building up the Shopozor software.

A service is required to build the static consumer ui when necessary #131

Open zadigus opened 4 years ago

zadigus commented 4 years ago

For the sake of better security, SEO, and performance, the consumer-ui is planned to be deployed as a static asset following the JAMStack philosophy. In order to make that happen, we need a service that triggers consumer-ui builds and deployments upon relevant database changes.

The flow would be as follows:

  1. change relevant data in database

  2. trigger static website generation

  3. smoke test the generated website

    • the site is static so no need to talk to the backend during the tests
    • smoke tests need to be very fast; they should just assess whether or not the website is up
  4. deploy the generated website to the production environment

In step 3, we probably only need a rough smoke test, without checking anything against the database. Verifying that the website was hydrated with the correct data is already done by the e2e tests before the master branch is promoted to production. We only need to ensure that the generated website is up and running, that's it.
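The four steps above could be wired together by a small pipeline service. Here's a minimal sketch in Python; all function names (`generate_site`, `smoke_test`, `deploy`) are placeholders for the real tooling, not existing Shopozor code:

```python
# Minimal sketch of the build -> smoke-test -> deploy pipeline described above.
# All callables are placeholders; the real service would shell out to the
# actual static site generator and deployment tooling.

def run_pipeline(generate_site, smoke_test, deploy):
    """Run one build cycle; deploy only if the smoke test passes."""
    site = generate_site()          # step 2: build the static site
    if not smoke_test(site):        # step 3: fast up/down check, no backend calls
        return False                # abort: never promote a broken build
    deploy(site)                    # step 4: push to the production environment
    return True

if __name__ == "__main__":
    # Stubbed example run: the "site" is just a directory name here.
    ok = run_pipeline(
        generate_site=lambda: "dist/",
        smoke_test=lambda site: True,   # pretend the site answers with 200
        deploy=lambda site: print(f"deploying {site}"),
    )
    print("pipeline ok:", ok)
```

The interesting property is the abort path: a failing smoke test must short-circuit the deployment.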

zadigus commented 4 years ago

@donnerc do we want to build that service in relation with https://github.com/shopozor/services/issues/42? We will very probably serve static assets by means of minio, so it might make sense to make the static website assets available through that service too, right? minio has client SDKs for almost any language, so you could build the website generation service in Go and have it use the minio client SDK for Go to publish the assets to the minio service. The minio service would then deploy the website to its Jelastic environment. Alternatively, we dedicate a separate service to website generation. Maybe going through the minio service complicates the architecture.
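If we publish through minio, the upload step could look like the sketch below (in Python for brevity; the Go SDK exposes the equivalent `FPutObject`). It accepts any client exposing minio's `fput_object(bucket_name, object_name, file_path)` signature, so the logic stays testable without a live server; the bucket name is made up:

```python
import os

# Sketch: publish every file of a generated site directory to an object store.
# `client` is anything with minio's fput_object(bucket, object_name, file_path)
# signature (e.g. minio.Minio from the Python SDK); duck typing keeps this
# testable without a running minio server. Bucket/paths are examples only.

def publish_site(client, bucket, build_dir):
    """Upload build_dir recursively; object keys mirror the relative paths."""
    uploaded = []
    for root, _dirs, files in os.walk(build_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            # the key is the path relative to the build dir, "/"-separated
            key = os.path.relpath(path, build_dir).replace(os.sep, "/")
            client.fput_object(bucket, key, path)
            uploaded.append(key)
    return uploaded
```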

zadigus commented 4 years ago

@donnerc still no input here. How do you want to proceed? Do you want to

zadigus commented 4 years ago

Maybe we can use this package to generate the static site more efficiently.

zadigus commented 4 years ago

As I got no feedback from @donnerc, here's the way we will proceed:

Details will come up while we develop the service. I'm not sure yet how to organize the necessary webhooks.

zadigus commented 4 years ago

In the env environment, env in {staging, production}, the flow should be

  1. site generator service pulls remote pre-env branch of the static site repo
  2. site generator service generates new static site
  3. site generator service pushes the new static site to the pre-env branch of the static site repo
  4. the website under test container gets notified thanks to git-sync of new changes to the pre-env branch, syncs with the new website code, purges cache (to make the new site available), and notifies (through a webhook) the static site tester service it is ready for testing
  5. the static site tester service gets notified it should test the new website
  6. the static site tester service smoke tests the pre-env branch on the pre-env k8s environment
  7. upon successful smoke tests, the static site tester service pushes the changes to the env branch of the static site repo
  8. the env static website gets notified of changes to the env branch of the static site repo thanks to git-sync
  9. upon changes to the env branch of the static site repo, the env static site is updated with the new static code and the cache is invalidated following this documentation

The website under test container should not be accessible through the same ingress as the published website. Should the smoke tests fail on the website under test, the maintainer can port-forward the container for temporary access.
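The smoke test in step 6 only has to assert that the site is up; a minimal sketch, assuming the URL points at the website under test container inside the cluster (no public ingress):

```python
import urllib.request
import urllib.error

# Minimal smoke test as described above: the site is static, so we never talk
# to a backend; we only check that the server answers HTTP 200 with a
# non-empty body within a short timeout.

def smoke_test(url, timeout=5):
    """Return True iff `url` answers HTTP 200 with a non-empty body."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and len(resp.read()) > 0
    except (urllib.error.URLError, OSError):
        return False
```

Anything fancier (data checks, navigation) stays in the e2e suite; this only answers "is the site up".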

shikamu commented 4 years ago

> In the env environment, env in {staging, production}, the flow should be
>
>   1. site generator service pulls remote pre-env branch of the static site repo
>   2. site generator service generates new static site
>   3. site generator service pushes the new static site to the pre-env branch of the static site repo
>   4. the static site tester service gets notified thanks to git-sync of new changes to the pre-env branch
>   5. the static site tester service smoke tests the pre-env branch on the pre-env k8s environment
>   6. upon successful smoke tests, the static site tester service pushes the changes to the env branch of the static site repo
>   7. the env static website gets notified of changes to the env branch of the static site repo thanks to git-sync
>   8. upon changes to the env branch of the static site repo, the env static site is updated with the new static code and the cache is invalidated following this documentation

I've got a few questions:

  1. what is the step that actually "mounts" or "deploys" the website to test? in step 4 the tester service gets notified and in step 5 it starts testing, but what is the process that makes the site testable? if a step is missing, please add it to the flow
  2. who has access to the site that gets tested in step 5?
  3. is there a process that makes sure the website is no longer available after testing finishes (whether successfully or not)? if so, please include it in the flow.
zadigus commented 4 years ago

> 1. what is the step that actually "mounts" or "deploys" the web site to test? in step 4 the tester service gets notified, in step 5 it starts testing, what is the process that makes the site testable? if a step is missing please add it to the flow

That's a good question. Maybe I was a bit fast on the testing service. I think we need an additional container serving the static website; I'll update the above steps.

The static site tester service is a docker container provided with the necessary cypress tools (e2e framework). It gets notified by the container serving the static website to start the tests. The container serving the static website is in sync (with git-sync) with the git repository containing the static website; whenever that repository's content is updated, the container gets updated too. Upon syncing, a webhook can purge the nginx cache, do whatever else is necessary to serve the new static website in that container, and notify the tester service to start testing.
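The webhook described here boils down to two ordered actions; a sketch where both callables are placeholders for the real nginx cache purge and the HTTP notification to the tester service:

```python
# Sketch of the webhook fired after git-sync updates the served content:
# first purge the nginx cache so the new site is actually served, then tell
# the tester service to start its smoke tests. Both callables are placeholders
# for the real commands (e.g. emptying the nginx cache dir, POSTing a webhook
# to the tester service).

def on_synced(purge_cache, notify_tester):
    """Handle a git-sync update: purge first, then notify; report the steps."""
    events = []
    purge_cache()               # make the freshly synced site visible
    events.append("purged")
    notify_tester()             # only then kick off the smoke tests
    events.append("notified")
    return events
```

The ordering matters: notifying the tester before the purge could let it test the stale cached site.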

> 2. who has access to the site that gets tested in step 5?

This is configurable. Of course, when the tests are successful, we don't care, as software maintainers, about accessing that site. When the tests are failing, however, we for sure want access to it to get some information on what might've happened. I'd say we should just close that site from the outside world by default, and if we need to access it for debugging purposes, then we should just port-forward it temporarily.

> 3. is there a process that makes sure the web site is no longer available after testing finishes (whether successfully or not)? if so please include it in the flow.

By default, it all happens within the cluster; there is no necessity to provide the website under test with an ingress. That's a good point that I'll add to the above flow.

shikamu commented 4 years ago

We need to make sure that access to the website is somewhat restricted. We wouldn't want the site under test, which potentially has security flaws, to be available to the world. I wonder if the following is doable:

  1. the site under test needs to be inaccessible to the world, only the test service should be able to access it
  2. we need our test service to really provide useful messages upon test failures. If there are bugs that we need to debug, we must be able to reproduce the bugs locally and fix them locally, and then repeat the process from your step 1.
  3. there may be cases where some bugs won't be reproducible locally, mostly for infrastructure reasons. Only in those rare cases shall we be able to temporarily open up the site under test (using a random port if possible) to test there directly. Ideally I'd suggest using ports which are not in nmap's top 1000 (see https://nullsec.us/top-1-000-tcp-and-udp-ports-nmap-default/)
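
For point 3, one low-effort way to land outside nmap's default top-1000 is to let the OS pick a free ephemeral port, which on Linux typically comes from the 32768-60999 range; a sketch:

```python
import socket

# Sketch for point 3: pick a free high port for a temporary port-forward.
# Binding to port 0 makes the OS assign a free ephemeral port (typically
# 32768-60999 on Linux), which avoids the well-known ports nmap probes first.

def pick_debug_port():
    """Return a currently free ephemeral port chosen by the OS."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]
```

The returned port could then be fed to the temporary port-forward of the website under test.
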
zadigus commented 4 years ago

Well, accessing that website under test should per se be very, very rare. We will see in the staging process how well the static site hydration with data is working, with a bunch of data. The only purpose of the smoke test is to verify that the static website, hydrated with the new data coming from the database, is working fine. It's more some kind of "data test" or "data hydration test".

The only problems that might pop up will be (100% guaranteed) related to problems in the data stored in our database. In the staging phase, we need to test what would happen if someone tried to make SQL injections or stuff like that. In production, I am not expecting any issue, if our smoke tests on the staging environment are good enough. That is, btw, a topic where I will need your support @shikamu.

The smoke tests will not raise any issue related to the app's pure behavior or function. They will instead detect SQL injection problems and other issues with our application's state, i.e. the data stored in our database. Upon SQL injection, our app might look screwed up, which will be caught by the tests.

Now, instead of exposing the website under test, we can automate the capture of screenshots exhibiting the issues, which is something we are already doing in our e2e tests. That would drastically reduce the number of reasons we would need to directly connect to the website under test.

zadigus commented 4 years ago

So, a first shot is now ready. It does nothing but generate the static website in our development environment as soon as you hit the /generate API call.

cf. https://github.com/shopozor/services/pull/182
cf. https://github.com/shopozor/services/pull/176
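A minimal version of such a `/generate` endpoint could look as follows (assuming a POST, with a placeholder build function; the linked PRs contain the actual service):

```python
import http.server
import json

# Sketch of the /generate endpoint described above. The handler only calls a
# placeholder build function; the real service (see the linked PRs) would
# invoke the actual static site generator for the development environment.

def generate_site():
    """Placeholder for the real static site build."""
    return {"status": "generated"}

class GenerateHandler(http.server.BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/generate":
            self.send_error(404)
            return
        body = json.dumps(generate_site()).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

if __name__ == "__main__":
    # Serve the endpoint locally; port 8080 is an arbitrary example.
    http.server.ThreadingHTTPServer(("127.0.0.1", 8080), GenerateHandler).serve_forever()
```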

zadigus commented 4 years ago

I think it might be worth checking how to do that with our CI / CD. In essence, the goal of this service is to build, test, and deploy, after all.