dCache / dcache

dCache - a system for storing and retrieving huge amounts of data, distributed among a large number of heterogenous server nodes, under a single virtual filesystem tree with a variety of standard access methods
https://dcache.org

systemd: dcache@*.service cannot be enabled #3630

Open calestyo opened 6 years ago

calestyo commented 6 years ago

Hey.

Seems no one really ever tested the new systemd units ;-)

When one tries to enable a dcache@*.service, to have it start up automatically on next boot, one gets e.g.

# systemctl enable dcache@admin.service 
Failed to enable unit: Unit /run/systemd/generator/dcache@admin.service is transient or generated.

This is because the unit files are created completely in the generator (and are not just symlinks to one central dcache@.service, where the instance name is used with e.g. %i).

#3628 partially solves this, as one can enable the global dcache.service I introduce there, but it doesn't give the flexibility which is even advertised in the release notes :-P

I think the only way would be to have a /lib/systemd/system/dcache@.service, at first I was kinda surprised that you don't do this anyway, but I guess the reason is you want the per domain config options like dcache.restart.delay in the unit file... with values as at the time when systemd is re-execed.

Not sure what's the best approach... creating the whole dcache@*.service files in the generator is probably not the cleanest/intended way though.

And I don't think adding a "static" dummy /lib/systemd/system/dcache@.service in addition to the dcache@*.service would be a good idea.

Any thoughts?

calestyo commented 6 years ago

I was asking around on IRC, and what they told me there is in principle that creating real foo@bar.service units (i.e. as real files) is not supposed to be done... and may stop working at any time.

When one uses such "instance units" (i.e. with an @), there should allegedly be just one file named foo@.service, which then uses %i and friends to create instances thereof.

While it's easy to create such a dcache@.service (the generator would still need to be kept for creating symlinks for each existing domain).... I have no real idea right now, how to do the config value parsing...

CC: @gbehrmann

calestyo commented 6 years ago

I think the way to go might be the following:

Now that env file is obviously transient... and it should go into some /run/... location.

Am I right that dcache already creates a configuration cache of the config in /etc/ ... and uses that cache instead of the /etc/dcache/ stuff... so that changes to /etc/dcache/* only get picked up when a domain restarts? This is in principle good... but I think it shouldn't be in /var/lib/ (i.e. /var/lib/dcache/config/cache) but possibly also in some /run/ location. Same probably for the poolmanager.conf ... this is neither canonical config (thus not etc) nor precious (as ZK has the config)... thus IMO /var/lib doesn't qualify either. What do you think?

Anyway... I think the env file for systemd should be in the same place as the dcache config cache... (and these two should always be created at the same time).

Once we have this, we can simply use the env file in the systemd instance unit file... and voila, we have everything together that we need... plus we don't create real foo@bar.service units, which it seems is really not foreseen.

What do you think? :)

Cheers, Chris.

calestyo commented 6 years ago

Just noted that the above won't work, as env vars set via EnvironmentFile= are only available to the executed processes... not to the unit file itself (i.e. not to directives such as RestartSec=). So we cannot set:

WorkingDirectory=${HOME}
RestartSec=${RESTART_DELAY}
User=$USER
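
To illustrate the distinction, here is a minimal sketch of a template unit fragment. The env file path, the variable names, the literal values and the Java main class are hypothetical and chosen only for illustration; they are not taken from the actual dCache units. Variables loaded via EnvironmentFile= are substituted in Exec lines, while directives like RestartSec= or User= are parsed when the unit is loaded and only accept literal values or specifiers such as %i.

```ini
# Hypothetical fragment of a dcache@.service template (illustration only).
[Service]
# Assumed location of a per-domain env file; %i expands to the instance name.
EnvironmentFile=/run/dcache/%i.env

# Works: systemd substitutes variables in Exec lines.
# ${CLASSPATH} becomes exactly one argument; $JAVA_OPTIONS is split on whitespace.
# org.example.Main is a placeholder class name, not the real dCache entry point.
ExecStart=/usr/bin/java $JAVA_OPTIONS -cp ${CLASSPATH} org.example.Main %i

# Does not work with variables: these directives never see the env file,
# so they must hold literal values (assumed here) or be set via a drop-in.
RestartSec=30
User=dcache
```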

Now the thing is that in the systemd-world these settings are not considered dCache configuration, so they shouldn't be set there (canonically) and just copied by systemd, but rather vice versa.

calestyo commented 6 years ago

So after some thinking the following might be possible:

in addition: dcache developers should discuss, which of the options used in the current unit files are actually dCache options and which should rather go completely into systemd config.

For example: IMO, the user is rather a systemd config. So the consequence would be to drop dcache.user and tell people instead how they can set this the systemd way.

Not sure whether the others in the current unit files: dcache.restart.delay, dcache.java.options, dcache.home, CLASSPATH and dcache.java.library.path are really needed within dCache config. If so, one could at least make all but HOME, RESTART_DELAY and USER into an environment file, which is created by the generator per domain.

But I think it's anyway better to move those that dCache itself doesn't need into systemd.

Now how would dCache admins e.g. change the user? There's a mechanism in systemd which allows one to override just certain settings in a unit file (without having to copy and modify the whole one). One would e.g. create a file like /etc/systemd/system/dcache@webdav.service.d/local.conf which contains just User=root. For all other settings the system-wide dcache@.service unit would be used. See the systemd.unit(5) manpage, section "Overriding vendor settings". I hope that this mechanism also works for instance units....
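
The drop-in described above could look like the following minimal sketch. It assumes a dcache@.service template is installed, which is what this issue proposes rather than what currently exists. Note that User= belongs to the [Service] section, so the drop-in needs that section header and not only the bare directive.

```ini
# /etc/systemd/system/dcache@webdav.service.d/local.conf
# Overrides only User= for the webdav instance; everything else
# still comes from the system-wide dcache@.service template.
[Service]
User=root
```

After creating or changing the file, systemctl daemon-reload makes systemd pick it up, and systemctl cat dcache@webdav.service shows the unit together with its drop-ins.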

What do you think?

kofemann commented 6 years ago

Hi Chris,

I think it makes sense to use as much of systemd as possible and drop redundant configuration options from dCache. However, we can do that only once the deb and rpm packages are 100% systemd.

calestyo commented 6 years ago

I've thought about that problem as well... but that also means that development is kinda stalled forever, as SL6 will be there... well not forever, but close to ;-)

Do you see any other way how we can "move forward" for those systems that support systemd? What e.g. about saying to sites: if you have systemd, then e.g. dcache.user will have no effect anymore and is ignored?

But to get the above done I need some help from you guys :-)

I can provide patches for the systemd units, but I don't know about the following:

(The same questions also apply at least to HOME, RESTART_DELAY and the other env vars used in the unit file which are not in the exec directives; e.g. CLASSPATH we could still take from the EnvFile.)

Thanks :)

calestyo commented 6 years ago

Oh, and without doing anything here, or at least merging #3628, dCache 3.2 is currently unusable on Debian... at least in the sense, that one cannot make it start automatically ;)

calestyo commented 6 years ago

@kofemann Maybe one could also do the following to keep dCache more homogeneous for systemd/non-systemd

For all the settings which are not really dCache settings (e.g. dcache.user) but which should rather go into systemd:

calestyo commented 6 years ago

There was now quite some discussion in the ticket I opened over at systemd - with the outcome only partially clear (at least to me).

AFAIU, creating real foo@instance.service files as our generator does right now is in principle supported, but these are then considered overrides (of the template foo@.service, which we don't have). It wasn't confirmed explicitly yet, but I think the conclusion is that a foo@.service is required. (See there for more discussion about why, from the systemd POV, it's reasonable then not to let one enable the dcache@domain.services we have without having a dcache@.service.)

Lennart, AFAIU, pointed out that we misuse the feature of instance units right now (as we don't use a template, i.e. a dcache@.service) and he suggested one way would be to simply create dcache-.service files - that is, if we want to keep things like dcache.user within dCache.

While this would certainly work, I still think it's not the right solution. Our domains are in fact instances of one and the same service: even the same jar file is executed, the same config is read, and so on.

So I still think the approach I've outlined above is the right way to go.

VilleS1 commented 1 year ago

I think RHEL9 is 100% systemd. The dcache systemd scripts seem to be broken.

kofemann commented 1 year ago

@VilleS1 Any details? Our daily builds run on CentOS-7 and CentOS-9.

VilleS1 commented 1 year ago

Ok, maybe it was just this:

kofemann commented 1 year ago

The hostname is expected to be the hostname of your system:

# cat /etc/dcache/dcache.conf

# hostname
dcache-lab007
# ls -l /etc/dcache/layouts/
total 4
-rw-r--r-- 1 root root 3193 May 11 12:30 dcache-lab007.conf
# 

Can you provide the output of similar commands on your system?

VilleS1 commented 1 year ago

[root@front1 ~]# hostname
front1.domain.com
[root@front1 ~]# ls -l /etc/dcache/layouts/
total 4
-rw-r--r--. 1 root root 756 May 17 07:26 front1.domain.com.conf

kofemann commented 1 year ago

Can you post your layout file?

VilleS1 commented 1 year ago

[${host.name}Domain]
dcache.broker.scheme = core

[${host.name}Domain/admin]
[${host.name}Domain/pnfsmanager]
pnfsmanager.default-retention-policy = REPLICA
pnfsmanager.default-access-latency = ONLINE

[${host.name}Domain/cleaner]
[${host.name}Domain/poolmanager]
[${host.name}Domain/billing]
[${host.name}Domain/gplazma]
[${host.name}Domain/webdav]
webdav.authn.basic = true

[${host.name}Domain/nfs]
nfs.version = 4.1

[${host.name}Domain/pool]
pool.name=${host.name}-pool
pool.path=/srv/dcache/pool
pool.size=10G
pool.wait-for-files=${pool.path}/data

kofemann commented 1 year ago

Looks good. What about: systemctl list-dependencies

# systemctl list-dependencies dcache.target
dcache.target
● ├─dcache@cleaner.service
● ├─dcache@core-dcache-lab007.service
● └─dcache@poolA.service

VilleS1 commented 1 year ago

[root@front1 dcache]# systemctl list-dependencies dcache.target
dcache.target
● └─dcache@front1Domain.service

kofemann commented 1 year ago

BTW, we can switch to support@dcache.org to avoid making details of your setup public.

VilleS1 commented 1 year ago

yes

kofemann commented 1 year ago

OK, then open a support ticket.

VilleS1 commented 1 year ago

opened