EIDA / mediatorws

EIDA NG Mediator/Federator web services
GNU General Public License v3.0

Add docker scripts #38

Closed Jollyfant closed 5 years ago

Jollyfant commented 6 years ago

Dockerfile and some static configuration to get the services up and running. I adapted this from the Dockerfile of Massimo.

Jollyfant commented 6 years ago

Let's say it's still a work in progress and I would like some help from Massimo. I managed to set up rsyslog in the container by starting the service manually, but couldn't figure out how to do it through the Dockerfile. A container can have only a single entrypoint, and that is Apache.

Same for daily harvesting, need to think of how to do it. Maybe we can get cron to do it.

Gotcha, I will update the config file. I took the old one from Massimo.

Python3 is probably no problem

damb commented 6 years ago

Same for daily harvesting, need to think of how to do it. Maybe we can get cron to do it.

So do I at federator-testing.ethz.ch.

Python3 is probably no problem.

:+1:

damb commented 6 years ago

I managed to set up rsyslog in the container by starting the service, but couldn't figure out how to do it through Dockerfile. It can have only a single entrypoint and that is apache.

Did you see this?: https://docs.docker.com/config/containers/multi-service_container/

Jollyfant commented 6 years ago

Yeah this is a good next step but I want to have Apache up and running first..

Did you see this?: https://docs.docker.com/config/containers/multi-service_container/

andres-h commented 6 years ago

Did you see this?: https://docs.docker.com/config/containers/multi-service_container/

BTW, we use https://github.com/phusion/baseimage-docker

damb commented 6 years ago

@andres-h, thanks. If you want a lightweight container providing enhanced system facilities, I guess lxc/lxd is the type of container you're looking for. I mostly use it for development purposes. Docker, on the other hand, seems to be aimed more at application deployment. See also: https://unix.stackexchange.com/questions/254956/what-is-the-difference-between-docker-lxd-and-lxc

andres-h commented 6 years ago

Yes, I know lxc. I was suggesting baseimage-docker because I understood that you were looking for a Docker solution for running multiple services in a single container. We are using baseimage-docker for running small groups of services (e.g., web server, syslog, cron) within one container, and it has many advantages compared to lxc.

damb commented 6 years ago

There is an image provided at Docker Hub. See also the discussion in #37 and #39.

Jollyfant commented 5 years ago

Hello again

Jollyfant commented 5 years ago

In the process of updating it to address the remaining concerns.

Jollyfant commented 5 years ago

Can all services write to the same log file or is that a big no-no?

Jollyfant commented 5 years ago

By the way, it works! So we will deploy this on our new VM that was scheduled to be delivered to us two weeks ago. :balloon:

damb commented 5 years ago

@Jollyfant, first of all: Good job :+1:!

Added rotating log handler once per day

This is a solution. Even more stable would be the use of a WatchedFileHandler + logrotate (e.g. once per day). This approach doesn't crash if the logfile is removed by accident.
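
A minimal sketch of the WatchedFileHandler idea, using only the Python standard library. The log path and logger names are placeholders, not the project's settings, and logrotate itself would be configured outside Python. The demo deletes the log file between two records to show that the handler reopens it instead of failing.

```python
import logging
import logging.handlers
import os
import tempfile


def build_watched_logger(log_path, name="eida.demo.watched"):
    """Return a logger that reopens log_path if it is moved or deleted.

    The logger name and path are illustrative placeholders.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.handlers.WatchedFileHandler(log_path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def demo_survives_removal():
    """Log, delete the file (as logrotate or an accident would), log again."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "service.log")
        logger = build_watched_logger(log_path)
        logger.info("before removal")
        os.remove(log_path)
        # The handler notices the file is gone and opens a fresh one.
        logger.info("after removal")
        with open(log_path) as fh:
            content = fh.read()
        for handler in list(logger.handlers):
            handler.close()
            logger.removeHandler(handler)
        return content


if __name__ == "__main__":
    print(demo_survives_removal())
```

Because the handler checks the file on every record, an external tool can rotate the file without the service needing a restart or a signal.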

Can all services write to the same log file or is that a big no-no?

It is quite easy to configure two log files.

  1. Provide both fed-logging.conf and stl-logging.conf
  2. Set path_logging_conf for each service within eidangws_config accordingly
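
The effect of the two steps above can be sketched in plain Python as follows. This is only an illustration of "one log file per service": the logger names and file names are assumptions, and in the real setup the equivalent settings would live in the two logging configuration files rather than in code.

```python
import logging
import os
import tempfile


def make_service_logger(service, log_dir):
    """Give one service its own logger and its own log file.

    Service and file names here are illustrative only.
    """
    logger = logging.getLogger("demo." + service)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records out of any shared root handler
    handler = logging.FileHandler(os.path.join(log_dir, service + ".log"))
    handler.setFormatter(
        logging.Formatter("%(name)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def demo_separate_files():
    """Write one record per service and return the contents of both files."""
    with tempfile.TemporaryDirectory() as log_dir:
        loggers = {
            service: make_service_logger(service, log_dir)
            for service in ("federator", "stationlite")
        }
        loggers["federator"].info("routing request")
        loggers["stationlite"].info("harvest finished")
        contents = {}
        for service, logger in loggers.items():
            for handler in list(logger.handlers):
                handler.close()
                logger.removeHandler(handler)
            with open(os.path.join(log_dir, service + ".log")) as fh:
                contents[service] = fh.read()
        return contents


if __name__ == "__main__":
    for service, text in demo_separate_files().items():
        print(service, "->", text.strip())
```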

BTW: Is there a reason why you decided to go for internal logging (i.e. logging to files) instead of sending logs to a logging daemon via e.g. SysLogHandler?
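
For comparison, a hedged sketch of the SysLogHandler alternative. The logger names are placeholders, and the demo sends its record to a throwaway UDP socket on the loopback interface that merely stands in for a real daemon such as rsyslog, so it runs without one.

```python
import logging
import logging.handlers
import socket


def build_syslog_logger(address, name="eida.demo.syslog"):
    """Return a logger that ships records to a syslog daemon.

    address is passed straight to SysLogHandler, so it may be a
    (host, port) tuple (UDP by default) or a Unix socket path.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.handlers.SysLogHandler(address=address)
    handler.setFormatter(
        logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def demo_loopback():
    """Send one record to a local UDP socket acting as a fake daemon."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    receiver.settimeout(5)
    logger = build_syslog_logger(receiver.getsockname(),
                                 name="eida.demo.loopback")
    try:
        logger.info("hello syslog")
        payload, _ = receiver.recvfrom(4096)
    finally:
        for handler in list(logger.handlers):
            handler.close()
            logger.removeHandler(handler)
        receiver.close()
    return payload


if __name__ == "__main__":
    print(demo_loopback())
```

With a daemon in between, several processes can log concurrently and the daemon takes care of writing the records out one at a time.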

All *.wsgi files are still configured to use venv but none are available inside the container. They are probably ignored when they can't be found?

Nope. Your config does not use any virtual environment setup. See also the corresponding mod_wsgi description. Within {federator,stationlite}.wsgi the code related to virtual environments is commented out.

damb commented 5 years ago

Hi @Jollyfant,

Still 3 points left:

I'm going to merge as soon as we've completed those tasks.

Thx and cheers

Jollyfant commented 5 years ago

When using a single log file, log messages might get mixed up. There is nothing serializing log messages, such as an rsyslog daemon. Hence, as long as you're logging directly to files, I strongly opt for a setup with two separate log files.

Done.

What about installing the eida-federator and stationlite into separate virtual environments?

Done.

Finally, would you also be so kind to add a few words to the docs?

Not yet but I will, no problem.

I think you (or anyone besides me) should test the image building/running before merging.

damb commented 5 years ago

I think you (or anyone besides me) should test the image building/running before merging.

I can do that. That's fine.

Jollyfant commented 5 years ago

Ok, make sure to use the docker-compose.yml so you get the mounts right.

damb commented 5 years ago

Hi @Jollyfant,

to keep it short:

I've appended a patch with the issues above solved. Simply apply the patch to your feature branch. The docs are still missing.

In future, IMO we should announce new features at eida_maint@gfz-potsdam.de when we can provide a stable, tested solution. Thanks.

cheers

0001-FIX-DOC-Testing-feature-docker.txt

Jollyfant commented 5 years ago

You copy an empty stationlite database, however afterwards you mount the volume to this directory. Instead, after running the container, once initially, we have to create a new DB and harvest. E.g.

When building the image, yeah. It's initially missing from db/ so when we run the container the first time there will be a stationlite.db that is empty. Then you run harvest once and it gets filled. Now because the directory is mounted outside the container the file is persistent. Next time you run the container you got stationlite up and ready to go.

You did not install the services into the virtual environments. You created them, yes, however, your services were still installed globally. Within your Dockerfile you need something like (to be adapted for stationlite, too.):

Oh I just configured them to be used in *.wsgi with python-home.

I have no problem with the log directory after chowning it to www-data. Apache needs to be able to write in the folder that is normally owned by root.

In future, IMO we should announce new features at eida_maint@gfz-potsdam.de when we can provide a stable, tested solution. Thanks.

I disagree, since we should look for contributions. The EIDA maintenance list is not an announcement board to show off on.

damb commented 5 years ago

When building the image, yeah. It's initially missing from db/ so when we run the container the first time there will be a stationlite.db that is empty. Then you run harvest once and it gets filled. Now because the directory is mounted outside the container the file is persistent. Next time you run the container you got stationlite up and ready to go.

Well, in my case this approach did not work. When mounting a volume to a directory, files within this directory are hidden. I did not know that Docker intends to copy those files to the volume mounted on top.

Apache needs to be able to write in the folder that is normally owned by root.

The user running the mod_wsgi daemon processes (i.e. www-data) needs the correct permissions. Not root.

Jollyfant commented 5 years ago

Applied your patch

The user running the mod_wsgi daemon processes (i.e. www-data) needs the correct permissions. Not root.

Correct, so this line in the Dockerfile does it:

RUN mkdir -p /var/www/mediatorws/log && chown www-data:www-data /var/www/mediatorws/log

Jollyfant commented 5 years ago

Did you drop the obspy dependency?

damb commented 5 years ago

Did you drop the obspy dependency?

Yes. Before it was installed globally. However, when using virtual environments global dependencies aren't taken into consideration anymore. So the global installation was useless in any case. It is installed into the virtual environment of stationlite, though.
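
A small diagnostic sketch for checking this from inside a running interpreter, e.g. the one used by a service. It relies only on the standard library; the function names are made up for illustration, and the prefix comparison assumes environments created with Python 3's venv module.

```python
import importlib.util
import sys


def in_virtualenv():
    """True when the interpreter runs from a venv, not the system install."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def module_visible(name):
    """True if a top-level module can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


def describe_environment(modules=("obspy",)):
    """Summarise the active prefixes and which modules can be imported."""
    return {
        "in_virtualenv": in_virtualenv(),
        "prefix": sys.prefix,
        "base_prefix": getattr(sys, "base_prefix", sys.prefix),
        "modules": {name: module_visible(name) for name in modules},
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(key, "=", value)
```

Run with the interpreter of a given virtual environment, this reports whether a dependency such as obspy is visible there, regardless of what is installed globally.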

damb commented 5 years ago

Correct, so this line in the Dockerfile does it:

RUN mkdir -p /var/www/mediatorws/log && chown www-data:www-data /var/www/mediatorws/log

Right, I also had issues with /var/tmp/ and db/.

Jollyfant commented 5 years ago

Ok, rebased this and added some documentation. I didn't touch what you wrote about your image, so you can review whether that is still up-to-date.

Can you try with a clean installation following the installation documentation? See if that works for you.

I have one problem with adding station-harvest logging, can you spot why it's not writing to file?

damb commented 5 years ago

Well, in my case this approach did not work. When mounting a volume to a directory, files within this directory are hidden. I did not know that Docker intends to copy those files to the volume mounted on top.

I checked once again the approach we've chosen (i.e. volume type bind). If I'm using type: volume Docker copies files to the mounted volume which corresponds to the behaviour you described initially. That is why I'd rather prefer a setup with volumes managed by Docker.

Attached a patch which implements volumes managed by Docker including some minor adjustments.

Besides I checked our changes and adjusted the docs.

I have one problem with adding station-harvest logging, can you spot why it's not writing to file?

For me it worked (You have to run eida-stationlite-harvest using the interpreter from the virtual environment.). I invoked

$ docker exec <container_name> /var/www/stationlite/venv3/bin/eida-stationlite-harvest --nodes-exclude ingv -- sqlite:////var/www/mediatorws/db/stationlite.db

(I had to exclude INGV since an invalid value for a <Dip></Dip> property is configured (for webservices.ingv.it/fdsnws/station/1/query?network=IV&station=TB02&location=21&channel=DH2&level=channel). This results in an invalid StationXML. @massimo1962 knows about this issue.)

Attachments: 0001-WIP-Use-volumes-managed-by-Docker.txt

damb commented 5 years ago

@Jollyfant, if you could check the most recent state once again. If you have no objections I'm finally going to merge. Thanks.

Jollyfant commented 5 years ago

I checked once again the approach we've chosen (i.e. volume type bind). If I'm using type: volume Docker copies files to the mounted volume which corresponds to the behaviour you described initially. That is why I'd rather prefer a setup with volumes managed by Docker.

Ok, let me try this, will do it tomorrow. There is a little bit more documentation that needs to be written to make sure permissions for log writing are OK.

For me it worked (You have to run eida-stationlite-harvest using the interpreter from the virtual environment.). I invoked

Yep, need to fix this in Cron.

(I had to exclude INGV since an invalid value for a property is configured (for webservices.ingv.it/fdsnws/station/1/query?network=IV&station=TB02&location=21&channel=DH2&level=channel). This results in an invalid StationXML. @massimo1962 knows about this issue.)

Does it not skip this particular channel?

damb commented 5 years ago

Yep, need to fix this in Cron.

Correct.

There is a little bit more documentation that needs to be written to make sure permissions for log writing are OK.

With the proposed setup permissions aren't an issue anymore. Permissions are configured within the Dockerfile and then propagated to the volume mounted on top.

Does it not skip this particular channel?

Skipping a single channel is absolutely impractical. I could only exclude a single route.

My first approach for improperly formatted StationXML was to stop harvesting completely (IMO as a minimum requirement fdsnws/station?format=xml must at least return valid StationXML). Unfortunately this isn't the case. So, with ea994f0 I decided to skip the corresponding EIDA node. Hence, the only thing left to discuss is where I catch the exception, i.e. if I

from harvesting in case such exceptions occur. But, I think this discussion is quite off-topic within this PR.
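
To make the "skip the whole node" behaviour concrete, here is a minimal sketch. It is not the code from ea994f0: the exception class, function names and node names are all hypothetical, and the actual harvesting is passed in as a callable.

```python
import logging

logger = logging.getLogger("demo.harvest")


class InvalidStationXML(Exception):
    """Hypothetical error for a node returning malformed StationXML."""


def harvest_all(nodes, harvest_node):
    """Harvest every node, skipping those that raise InvalidStationXML.

    harvest_node is a caller-supplied callable taking a node name.
    Returns a (harvested, skipped) pair of lists of node names.
    """
    harvested, skipped = [], []
    for node in nodes:
        try:
            harvest_node(node)
        except InvalidStationXML as err:
            # One broken node must not abort the run for all the others.
            logger.warning("Skipping node %s: %s", node, err)
            skipped.append(node)
            continue
        harvested.append(node)
    return harvested, skipped


if __name__ == "__main__":
    def fake_harvest(node):
        # Stand-in for real harvesting; pretends one node is broken.
        if node == "ingv":
            raise InvalidStationXML("invalid Dip value")

    print(harvest_all(["gfz", "ingv", "odc"], fake_harvest))
```

Catching the exception per node, rather than around the whole loop, is what lets the remaining nodes still be harvested.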

Jollyfant commented 5 years ago

With the proposed setup permissions aren't an issue anymore. Permissions are configured within the Dockerfile and then propagated to the volume mounted on top.

Ok, do you know how to configure the location on the host where Docker stores these volumes? I don't want them under the default /var/lib/docker/volumes/ but somewhere else.

Copying of stationlite.empty.db is no longer required it seems, correct? I can remove this from the docs.

damb commented 5 years ago

Copying of stationlite.empty.db is no longer required it seems, correct? I can remove this from the docs.

Copying is done within the Dockerfile.

Ok, do you know how to configure the location on the host where Docker stores these volumes? I don't want them under the default /var/lib/docker/volumes/ but somewhere else.

Is this useful?:

https://stackoverflow.com/questions/36014554/how-to-change-the-default-location-for-docker-create-volume-command

Jollyfant commented 5 years ago

Hmm no, I cannot touch the docker daemon. Anyway, we can leave this as the default for the deployment but I might change some volume settings running this @ ODC.

damb commented 5 years ago

That's weird, the patch from above contained already an updated version of the docs. Is there a reason why you didn't apply it with d3864fe? There, I also mentioned the persistent volumes managed by Docker ...

Jollyfant commented 5 years ago

That's weird, the patch from above contained already an updated version of the docs. Is there a reason why you didn't apply it with d3864fe?

Not that I know of.. are the docs OK?

damb commented 5 years ago

Not that I know of.. are the docs OK?

Mmh. I also adjusted the text regarding the previous first Docker image version. That is why I'd prefer the version contained in the patch. Of course, feel free to modify it.

@Jollyfant, I haven't written contribution guidelines yet. But if I had, they would certainly contain:

Thx.

Jollyfant commented 5 years ago

Okidoki, last review and let's merge it

Jollyfant commented 5 years ago

I'll leave the honor of merging this marvel to you

damb commented 5 years ago

@Jollyfant, thanks. Finally you decided not to mention the volumes managed by Docker. I'm fine with that.

Some future improvements would include:

But let's keep it for the future.