cern-eos / eos

eosxd not starting due to unset credential-store #37

Status: Closed (closed by panik4 2 years ago)

panik4 commented 2 years ago

Hi,

when starting eosxd, I get the following error (servername is a replacement for the real hostname):

# eosxd -f  -ofsname=servername /eos
# fsname='servername'
# -o allow_other enabled on shared mount
# -o big_writes enabled
# JSON parsing successful
# no config file for local overwrites
# dead mount detected - forcing 'umount -l /eos'
# Running with XRD_NODELAY=1 (nagle algorithm is disabled)
# Setting MALLOC_CONF=dirty_decay_ms:0
terminate called after throwing an instance of 'FatalException'
  what():  Cannot stat uuid-store repository: 
Aborted

Now I traced this back in the sources to root["auth"]["credential-store"] not being set. I just can't figure out why the default values aren't used.

I have a fuse.servername.conf, which is loaded successfully. I have also tried defining "credential-store" : "/var/cache/eos/fusex/credential-store/" explicitly, but that didn't work either.
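A minimal sketch for checking what such a config file actually provides for root["auth"]["credential-store"], and whether that path can be stat'ed. It assumes the file is plain JSON and that the key lives under an "auth" object, as the lookup above suggests; the function name check_credential_store is hypothetical and not part of EOS.

```python
import json
import os


def check_credential_store(config_path):
    """Return (configured_path, problem); problem is None if the path can be stat'ed."""
    with open(config_path) as handle:
        root = json.load(handle)
    # Mirrors the lookup named in the issue: root["auth"]["credential-store"].
    # A missing "auth" object or a missing key is treated as an empty string.
    store = root.get("auth", {}).get("credential-store", "")
    if not store:
        return store, "auth.credential-store is unset or empty"
    try:
        os.stat(store)
    except OSError as err:
        return store, f"cannot stat {store!r}: {err.strerror}"
    return store, None
```

Calling check_credential_store on the fuse.servername.conf in question (its location on disk is not stated here) would show whether the key is missing, sits outside the "auth" object, or points to a directory that does not exist.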

When running eosxd -f -ofsname=servername.gsi.de /eos, the mount works, but only for reading, because the permission settings of the default config are not correct for our setup. In that case /var/cache/eos/fusex/credential-store/ does get created.

Thanks, Paul

Attachment: fuse.servername.txt

apeters1971 commented 2 years ago

Did you install our RPMs or did you compile the software yourself? Do you run with SELinux enabled? This could be due to SELinux policies ...

panik4 commented 2 years ago

I am using your RPMs, and SELinux only reports the crash with: comm="eosxd" reason="memory violation"

Searching for the error in the code, this is where it's thrown: https://github.com/cern-eos/eos/blob/73800f55f773a31cfaf4cceb781a8ab7e25719f4/fusex/auth/UuidStore.cc#L43

As the error message shows (nothing follows the colon), the string "repository" is empty.
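To illustrate why an empty path produces exactly this message, here is a small Python sketch. It is not the EOS implementation (which is C++); stat_repository is a hypothetical stand-in that only mimics the behaviour described above: stat the path and raise an error carrying the path if that fails.

```python
import os


def stat_repository(repository):
    """Raise RuntimeError if the repository path cannot be stat'ed."""
    try:
        os.stat(repository)
    except OSError as err:
        # The path is appended after the colon, so an empty path
        # leaves nothing after it, as in the log above.
        raise RuntimeError(f"Cannot stat uuid-store repository: {repository}") from err


try:
    stat_repository("")  # an empty string is never a valid path
except RuntimeError as exc:
    print(exc)
```

A stat on an empty string always fails, so the problem is the path never being filled in, not the directory being missing or unreadable.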

In contrast, when I start the daemon with eosxd -f -ofsname=servername.institute.de /eos, where servername.institute.de is the FQDN of the MGM, the default path for the credential store gets created. In this case, the default config is used. The mount is there and the directories are shown, but of course the permissions aren't correct, as the Kerberos lines in our fuse.servername.conf aren't used.

Now why would the credential store be empty when using an existing config?

Thanks, Paul

apeters1971 commented 2 years ago

I would do it like this:

If you want to mount EOS from machine "eos.gsi.de" you do:

eosxd -ofsname=eos.gsi.de:/eos/ /eos/
or 
mount -t fuse eosxd -ofsname=eos.gsi.de:/eos/ /eos/

If your server name has some 'fancy' subdomain or characters, that can be a problem.

With a config file you can do e.g.:

cat /eos/fuse/eos.default.conf
{
  "hostport" : "eos.gsi.de"
}
mount -t fuse eosxd -ofsname=default /eos/

By default, the mount supports almost all authentication methods except X509, so you rarely need a config file at all.

panik4 commented 2 years ago

Thanks again. I had already tried most of the steps you recommended. Sadly, I can't say what the exact problem was, as only a full reinstall of the MGM fixed it. The reinstall used the same Chef role files as before, so something in the state of the machine must have changed while it was in use. I'll report the issue again if it ever recurs and I find out why.

Thanks, Paul