sandstorm-io / sandstorm

Sandstorm is a self-hostable web productivity suite. It's implemented as a security-hardened web app package manager.
https://sandstorm.io

Backup restore fails #3426

Open fvlasie opened 4 years ago

fvlasie commented 4 years ago

Hello everyone,

I am trying to restore a tar of my /opt/sandstorm directory into a new install of sandstorm.

I am following the instructions on this page: https://docs.sandstorm.io/en/latest/administering/backups/#to-back-up-the-entire-sandstorm-server

When I run sandstorm start, I get the following errors in sandstorm/var/log/sandstorm.log:

** Starting Sandstorm at: Wed Sep  2 11:41:10 2020
** Starting back-end...
sandstorm/util.c++:654: fatal: *exception = kj/async-io-unix.c++:660: failed: ::bind(sockfd, &addr.generic, addrlen): Permission denied; toString() = unix:/var/sandstorm/socket/shell-cli
stack: 543a89 4ecb39 5d062c 5d0619
*** Uncaught exception ***
sandstorm/run-bundle.c++:2432: failed: expected sandstorm::readAll(inPipe) == "ready"; starting back-end failed
stack: 4e290f 4e05c5 4de3af 4dde1c 632294 631e7b
** Server monitor died. Aborting.

Thank you!

ocdtrekkie commented 4 years ago

Does the sandstorm user account have access to the whole directory?

ocdtrekkie commented 4 years ago

You can see what the install script generally does here: https://github.com/sandstorm-io/sandstorm/blob/master/install.sh#L1458-L1462

fvlasie commented 4 years ago

Thank you for the reply!

I ran chown -R sandstorm:sandstorm /opt/sandstorm and now I get the error

Install directory not owned by root, but you're running as root.
** Server monitor died. Aborting.

zenhack commented 4 years ago

On my system, I see:

$ ls -l /opt/sandstorm/
total 28
drwx------  3 root root      4096 13. Jun 00:20 downloading.t0lTbF
lrwxrwxrwx  1 root root        36  1. Sep 20:25 latest -> sandstorm-custom.2020-09-01_20-25-09
lrwxrwxrwx  1 root root        16  5. Dez 2019  sandstorm -> latest/sandstorm
drwxr-xr-x 17 root root      4096  6. Jun 16:38 sandstorm-267
-rw-r--r--  1 root root       285 31. Aug 13:00 sandstorm.conf
drwxr-xr-x 16 root root      4096 31. Aug 12:54 sandstorm-custom.2020-08-31_12-55-22
drwxr-xr-x 16 root root      4096  1. Sep 20:24 sandstorm-custom.2020-09-01_20-25-09
drwxrwx--T  3 root sandstorm 4096  2. Sep 00:48 tmp
drwxr-xr-x  6 root root      4096  5. Dez 2019  var
$ ls -l /opt/sandstorm/var/
total 32
drwxrwx--- 2 root sandstorm 20480  1. Sep 20:26 log
drwxrwx--- 3 root sandstorm  4096 22. Aug 13:52 mongo
drwxrwx--- 2 root sandstorm  4096  5. Dez 2019  pid
drwxrwx--- 8 root sandstorm  4096  1. Sep 20:26 sandstorm

So, most stuff should be owned by root, but with group access for sandstorm for things that are mutable.
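
To compare a restored install against the listing above, a small report of owner, group, and mode for the same directories can help. This is a minimal sketch, not part of Sandstorm: the function name `report_ownership` and the `SANDSTORM_DIR` variable are made up for illustration, the path defaults to the /opt/sandstorm location used in this thread, and it assumes GNU coreutils `stat`.

```shell
#!/bin/sh
# Print "owner:group mode path" for the directories listed above.
# report_ownership and SANDSTORM_DIR are hypothetical names, not Sandstorm
# tooling. Assumes GNU coreutils `stat` (the -c format flag).
report_ownership() {
    base="$1"
    for path in "$base" "$base/var" "$base/var/log" "$base/var/mongo" \
                "$base/var/pid" "$base/var/sandstorm"; do
        if [ -e "$path" ]; then
            stat -c '%U:%G %A %n' "$path"
        else
            # Also printed when the path exists but is not visible to this user.
            echo "missing $path"
        fi
    done
}

report_ownership "${SANDSTORM_DIR:-/opt/sandstorm}"
```

Run it as root so the group-restricted directories are visible, then compare the output line by line with the listing above.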

fvlasie commented 4 years ago

Thank you for the information!

MongoDB is not starting.

Now I am seeing an error in /opt/sandstorm/var/log/mongo.log which is

2020-09-02T22:53:37.638+0000 [initandlisten] couldn't open /var/mongo/admin.ns errno:1 Operation not permitted

Could I, perhaps, see your permissions for /opt/sandstorm/var/mongo/?

Thank you for helping!

zenhack commented 4 years ago

[isd@elf ~]$ ls -l /opt/sandstorm/var/mongo/
total 163856
-rw------- 1 sandstorm sandstorm 16777216  5. Dez 2019  admin.0
-rw------- 1 sandstorm sandstorm 16777216  5. Dez 2019  admin.ns
drwxrwx--- 2 sandstorm sandstorm     4096  2. Sep 00:48 journal
-rw------- 1 sandstorm sandstorm 16777216  1. Sep 20:26 local.0
-rw------- 1 sandstorm sandstorm 33554432  2. Sep 00:40 local.1
-rw------- 1 sandstorm sandstorm 16777216  2. Sep 00:40 local.ns
-rw------- 1 sandstorm sandstorm 16777216  2. Sep 00:40 meteor.0
-rw------- 1 sandstorm sandstorm 33554432 31. Aug 19:17 meteor.1
-rw------- 1 sandstorm sandstorm 16777216  1. Sep 20:34 meteor.ns
-rwxrwx--- 1 sandstorm sandstorm        0  2. Sep 00:48 mongod.lock
-rw-r----- 1 sandstorm sandstorm       27  5. Dez 2019  passwd

ocdtrekkie commented 4 years ago

We probably should have something like a "run this script to fix file permissions on your Sandstorm install", @zenhack, assuming the defaults the installer uses. Definitely not the first time someone's hit this.

fvlasie commented 4 years ago

Thank you both for your help.

Sandstorm is running now but I continue to get permission errors when trying to use it.

A permissions fix script would be wonderful!

:)

zenhack commented 4 years ago

@fvlasie, if you run:

cd /opt/sandstorm
chown -R sandstorm:sandstorm var/{log,pid,mongo} var/sandstorm/{apps,grains,downloads}
chown root:sandstorm var/{log,pid,mongo,sandstorm} var/sandstorm/{apps,grains,downloads}
chmod -R g=rwX,o= var/{log,pid,mongo,sandstorm} var/sandstorm/{apps,grains,downloads}

Does that quiet the remaining errors?

(Adapted from the install script linked above).
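
For anyone unsure what the symbolic mode `g=rwX,o=` in the last command does: it gives the group read and write, adds group execute only on directories (or files that are already executable), and removes all access for others. The sketch below shows this on a throwaway directory rather than a real install; the directory and file names are illustrative only, and it assumes GNU `stat`.

```shell
#!/bin/sh
# Demonstrate `chmod -R g=rwX,o=` on a scratch directory (not a real
# Sandstorm install). Names below are illustrative. Assumes GNU `stat -c`.
demo=$(mktemp -d)
mkdir "$demo/grains"
touch "$demo/grains/data"
chmod 755 "$demo/grains"
chmod 644 "$demo/grains/data"

# Same mode expression as in the commands above.
chmod -R g=rwX,o= "$demo"

dir_mode=$(stat -c '%a' "$demo/grains")
file_mode=$(stat -c '%a' "$demo/grains/data")
echo "directory: $dir_mode"   # 770: group gains rwx, others lose all access
echo "file:      $file_mode"  # 660: group gains rw only, file was not executable
rm -rf "$demo"
```

The capital `X` is why the same command is safe to run recursively: data files do not become executable, while directories stay traversable for the sandstorm group.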

fvlasie commented 4 years ago

Pardon me while I try to get my sandcats domain working again. Then I can address the remaining permissions errors.

Thank you again, your help has been amazing!

ocdtrekkie commented 4 years ago

Let us know if you have any issues restoring your Sandcats setup too!

fvlasie commented 4 years ago

Shall I ask here or open a separate issue?

ocdtrekkie commented 4 years ago

It is still a failure to restore from backup! We do want to make sure this is not painful.

fvlasie commented 4 years ago

The error I am getting is:

ERROR: remote exception: remote exception: Error: queryTxt ENODATA _acme-challenge

I am guessing it means the _acme-challenge TXT record has no data... This is after running install.sh with the "help"

This seems to be an error on the sandcats.io side.

Thank you!

fvlasie commented 4 years ago

Checking further, _acme-challenge.mydomain.sandcats.io is resolving to my IP. It should be resolving to sandcats.io's IP, should it not?
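
The ENODATA error comes from a TXT lookup, so it may be worth checking the A and TXT records for the challenge name separately. A minimal sketch, assuming `dig` is installed; the function names are made up for illustration, and mydomain.sandcats.io is the placeholder hostname used in this thread.

```shell
#!/bin/sh
# Look up the A and TXT records for an ACME dns-01 challenge name.
# challenge_name and check_challenge are hypothetical helper names.
# Assumes `dig` (dnsutils / bind-utils) is installed.
challenge_name() {
    printf '_acme-challenge.%s\n' "$1"
}

check_challenge() {
    name=$(challenge_name "$1")
    echo "A records for $name:"
    dig +short A "$name"
    echo "TXT records for $name:"
    dig +short TXT "$name"
}

# Only query DNS when a hostname is passed as the first argument.
if [ -n "${1:-}" ]; then
    check_challenge "$1"
fi
```

Saved as, say, check-challenge.sh and run with mydomain.sandcats.io as its argument, an empty TXT section alongside a populated A section would be consistent with the ENODATA error: the name resolves, but no TXT record is published for it.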

ocdtrekkie commented 4 years ago

It points to my IP for my Sandstorm server too.

ocdtrekkie commented 4 years ago

I am kinda wondering if you need to re-set up Let's Encrypt. We have some (not yet in the docs) commands for doing it from the command line:

sandstorm create-acme-account "YOUR_EMAIL" --accept-terms
sandstorm renew-certificate

I am not 100% sure how well-tested recovering Sandcats is since we got Let's Encrypt added (you might be the first to do this!) so we might need to fix up some issues.

fvlasie commented 4 years ago

I am thinking it might be a firewall issue. I am running this on an OpenStack cloud and maybe it needs some DNS ports opened to reply to the acme-challenge.

During the new certificate procedure only http is used. Hmmm...

fvlasie commented 4 years ago

sandstorm create-acme-account "YOUR_EMAIL" --accept-terms
sandstorm renew-certificate

Results in the same error unfortunately:

ERROR: remote exception: remote exception: Error: queryTxt ENODATA _acme-challenge.mydomain.sandcats.io
    at QueryReqWrap.onresolve [as oncomplete] (dns.js:203:19)

fvlasie commented 4 years ago

I tried to recover the certificate using a throwaway VM but the error persists even after copying the new cert files into my Sandstorm install. I have opened port 53 to my VM but I still had no success. I am not sure what else to try.

ocdtrekkie commented 4 years ago

The certs are no longer stored in the file system. They're stored in the mongo database now. But Sandcats should be able to create them, I would think.

fvlasie commented 4 years ago

Oh I see! Are there instructions somewhere for doing a manual certificate transfer with the new storage method?

This is what I was following: https://docs.sandstorm.io/en/latest/administering/sandcats/#re-installing-sandstorm-and-keeping-your-sandcats-domain

ocdtrekkie commented 4 years ago

Yeah, I think that's just old/wrong. I'm not positive you should need to move anything, because Let's Encrypt should mostly "just work". But I'm not sure how to troubleshoot what's going on. It's possible something was not accounted for in this case when updating how all this works.

@zenhack Any idea what to look for here?

fvlasie commented 4 years ago

Let's Encrypt should mostly "just work".

That makes me smile thinking of the time Let's Encrypt was eating its own configuration files... :) but yes it does work well now and it is fine when renewing certs. It is the subdomain recovery process that is failing.

ocdtrekkie commented 4 years ago

I know that @zenhack did test recovering a Sandcats domain himself, but that was A. on a fresh install and B. with a subdomain that had never used Let's Encrypt before. And the only thing he had to do was run the two commands I gave you above to set up Let's Encrypt and renew the cert manually.

So I am not positive where we're running into an issue. If the error is related to that TXT record, it seems we're bombing out on the Let's Encrypt step of things.

fvlasie commented 4 years ago

I have an unusual setup here. I think the only way to get my sandstorm back in business is either to figure out how to put the cert back into mongo or have someone at sandcats HQ delete the subdomain and let me request it again. I did email support at sandstorm dot io but they told me it couldn't be done. :(

Thank you for your continued support!

ocdtrekkie commented 4 years ago

I kinda doubt it's related to the Sandcats reservation anyhow; it looks more like the Let's Encrypt side of this, since it's looking for that ACME challenge. I suspect if one reconfigured the server to HTTP (and ideally local.sandstorm.io), and then cleared and reset the HTTPS settings, we might be golden, but I know that's a challenge/security problem if you're hosting this on a cloud server.

ocdtrekkie commented 4 years ago

What does your sandstorm.log file look like these days? You posted one when your permissions were wrong, but not since the HTTPS issue.

fvlasie commented 4 years ago

I hope you had a nice Labour Day weekend!

Here is the log output after attempting certificate retrieval:

ACME.js notification: certificate_order {
  account: {
    key: { kid: 'https://acme-v02.api.letsencrypt.org/acme/acct/95753906' }
  },
  subject: '*.mydomain.sandcats.io',
  altnames: [ '*.mydomain.sandcats.io', 'mydomain.sandcats.io' ],
  challengeTypes: [ 'dns-01' ]
}
ACME.js notification: challenge_select {
  altname: '*.mydomain.sandcats.io',
  type: 'dns-01',
  dnsHost: '_acme-challenge.mydomain.sandcats.io',
  keyAuthorization: 'zSMMZqid33SxdS4TP3DsDxT-ITaKzh_u9XGCw4ji1yU.B3Okk2aMGyUGdU2FO6yI_NwLXW7CiTll-KI8XHRo64o'
}
ACME.js notification: _challenge_select {
  altname: '*.mydomain.sandcats.io',
  type: 'dns-01',
  challenge: {
    identifier: { type: 'dns', value: 'mydomain.sandcats.io' },
    status: 'pending',
    expires: '2020-09-11T05:41:54Z',
    challenges: [ [Object] ],
    wildcard: true,
    type: 'dns-01',
    url: 'https://acme-v02.api.letsencrypt.org/acme/chall-v3/6974730312/Q7XpGw',
    token: 'zSMMZqid33SxdS4TP3DsDxT-ITaKzh_u9XGCw4ji1yU',
    hostname: 'mydomain.sandcats.io',
    altname: '*.mydomain.sandcats.io',
    thumbprint: 'B3Okk2aMGyUGdU2FO6yI_NwLXW7CiTll-KI8XHRo64o',
    keyAuthorization: 'zSMMZqid33SxdS4TP3DsDxT-ITaKzh_u9XGCw4ji1yU.B3Okk2aMGyUGdU2FO6yI_NwLXW7CiTll-KI8XHRo64o',
    dnsHost: '_acme-challenge.mydomain.sandcats.io',
    dnsAuthorization: 'WfooeKXCa-WRQhDkfubG_DNx6Yk0HedOATY3uqUCJr0',
    keyAuthorizationDigest: 'WfooeKXCa-WRQhDkfubG_DNx6Yk0HedOATY3uqUCJr0',
    dnsZone: 'mydomain.sandcats.io',
    dnsPrefix: '_acme-challenge'
  }
}
ACME.js notification: challenge_select {
  altname: 'mydomain.sandcats.io',
  type: 'dns-01',
  dnsHost: '_acme-challenge.mydomain.sandcats.io',
  keyAuthorization: 'WXLU1xc-DkZ86u7HAohexj3FrL0AbKYkYUbklJbQlYM.B3Okk2aMGyUGdU2FO6yI_NwLXW7CiTll-KI8XHRo64o'
}
ACME.js notification: _challenge_select {
  altname: 'mydomain.sandcats.io',
  type: 'dns-01',
  challenge: {
    identifier: { type: 'dns', value: 'mydomain.sandcats.io' },
    status: 'pending',
    expires: '2020-09-11T05:41:54Z',
    challenges: [ [Object], [Object], [Object] ],
    type: 'dns-01',
    url: 'https://acme-v02.api.letsencrypt.org/acme/chall-v3/6974730315/GmC5PA',
    token: 'WXLU1xc-DkZ86u7HAohexj3FrL0AbKYkYUbklJbQlYM',
    hostname: 'mydomain.sandcats.io',
    altname: 'mydomain.sandcats.io',
    thumbprint: 'B3Okk2aMGyUGdU2FO6yI_NwLXW7CiTll-KI8XHRo64o',
    keyAuthorization: 'WXLU1xc-DkZ86u7HAohexj3FrL0AbKYkYUbklJbQlYM.B3Okk2aMGyUGdU2FO6yI_NwLXW7CiTll-KI8XHRo64o',
    dnsHost: '_acme-challenge.mydomain.sandcats.io',
    dnsAuthorization: 'OEuuOSYAfCKo1SynCV9G_1ou_HBgprZvS0HND0Aa_h4',
    keyAuthorizationDigest: 'OEuuOSYAfCKo1SynCV9G_1ou_HBgprZvS0HND0Aa_h4',
    dnsZone: 'mydomain.sandcats.io',
    dnsPrefix: '_acme-challenge'
  }
}
Failed to renew certificate (will try again in 6 hours): Error: queryTxt ENODATA _acme-challenge.mydomain.sandcats.io
    at QueryReqWrap.onresolve [as oncomplete] (dns.js:203:19)
sandstorm/gateway.c++:951: info: Loading TLS key into Gateway
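
For reference, in the dns-01 challenge (RFC 8555, section 8.4) the TXT record at dnsHost is expected to hold the unpadded base64url encoding of the SHA-256 digest of the key authorization. The dnsAuthorization and keyAuthorizationDigest fields in the log above appear to be that value. A minimal sketch of the computation, assuming `openssl` and `base64` are installed; the function name `dns01_txt_value` is made up for illustration.

```shell
#!/bin/sh
# Compute the dns-01 TXT value for a key authorization:
# base64url(SHA-256(keyAuthorization)) without padding (RFC 8555, 8.4).
# dns01_txt_value is a hypothetical helper name.
# Assumes `openssl` and `base64` are installed.
dns01_txt_value() {
    printf '%s' "$1" \
        | openssl dgst -sha256 -binary \
        | base64 \
        | tr '+/' '-_' \
        | tr -d '=\n'
    echo
}

# Only compute when a key authorization is passed as the first argument.
if [ -n "${1:-}" ]; then
    dns01_txt_value "$1"
fi
```

Passing one of the keyAuthorization strings from the log should yield the matching dnsAuthorization if ACME.js follows the RFC here, which has not been checked in this thread. That value is what a TXT lookup on _acme-challenge.mydomain.sandcats.io would need to return for validation to succeed.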

ocdtrekkie commented 4 years ago

Perhaps @kentonv can look at this? This isn't the only Sandcats/Let's Encrypt-related issue reported recently.