carte018 / Proconsul

Proconsul: Console Access for the Provinces

Proconsul not starting properly #14

Open cvera86 opened 3 years ago

cvera86 commented 3 years ago

We build the image by running the following command:

./build -c master.config -r -q

We then run ./run3.sh to start Proconsul. To see which container is running:

docker ps
CONTAINER ID   IMAGE                COMMAND                  CREATED      STATUS      PORTS   NAMES
faa2acdbcea5   carte018/proconsul   "/usr/bin/supervisord"   2 days ago   Up 2 days           epic_panini

I open Chrome and go to https://workingwith.ibrsp.org/proconsul/. The message states that it cannot find the site.

Looking at the logs:

docker logs faa2acdbcea5
/usr/lib/python2.7/dist-packages/supervisor/options.py:295: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
  'Supervisord is running as root and it is searching '
2021-07-14 07:17:33,898 CRIT Supervisor running as root (no user in config file)
2021-07-14 07:17:33,898 WARN Included extra file "/etc/supervisor/conf.d/99local.conf" during parsing
2021-07-14 07:17:33,922 INFO RPC interface 'supervisor' initialized
2021-07-14 07:17:33,922 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2021-07-14 07:17:33,922 INFO supervisord started with pid 1
2021-07-14 07:17:34,924 INFO spawned: 'shibd' with pid 9
2021-07-14 07:17:34,926 INFO spawned: 'tomcat' with pid 10
2021-07-14 07:17:34,927 INFO spawned: 'docker-gen' with pid 11
2021-07-14 07:17:34,928 INFO spawned: 'fixperm' with pid 12
2021-07-14 07:17:34,929 INFO spawned: 'mysqld' with pid 13
2021-07-14 07:17:34,931 INFO spawned: 'apache' with pid 15
2021-07-14 07:17:34,938 INFO success: shibd entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2021-07-14 07:17:34,938 INFO success: tomcat entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2021-07-14 07:17:34,938 INFO success: docker-gen entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2021-07-14 07:17:34,938 INFO success: fixperm entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2021-07-14 07:17:34,938 INFO success: mysqld entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2021-07-14 07:17:34,938 INFO success: apache entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2021-07-14 07:17:34,938 INFO exited: docker-gen (exit status 0; expected)
2021-07-14 07:17:34,943 CRIT reaped unknown pid 17)
2021-07-14 07:17:34,954 INFO exited: fixperm (exit status 0; expected)
2021-07-14 07:17:34,994 CRIT reaped unknown pid 58)
2021-07-14 07:17:35,643 CRIT reaped unknown pid 60)
2021-07-14 07:17:36,162 INFO exited: apache (exit status 0; expected)
2021-07-14 07:17:37,144 INFO exited: mysqld (exit status 0; expected)
2021-07-14 07:17:37,204 CRIT reaped unknown pid 584)
2021-07-14 07:17:39,356 INFO exited: shibd (exit status 0; expected)
2021-07-14 07:17:40,023 INFO exited: tomcat (exit status 1; not expected)

We are running the following docker version:

Client: Docker Engine - Community
 Version:           20.10.6
 API version:       1.41
 Go version:        go1.13.15
 Git commit:        370c289
 Built:             Fri Apr 9 22:45:33 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.6
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       8728dd2
  Built:            Fri Apr 9 22:43:57 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.4
  GitCommit:        05f951a3781f4f2c1911b05e61c160e9c30eaa8e
 runc:
  Version:          1.0.0-rc93
  GitCommit:        12644e614e25b05da6fd08a38ffa0cfe1903fdec
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

We were having issues with containerd version 1.4.6, so we downgraded to 1.4.4.

Thank you

carte018 commented 3 years ago

There's a lot to try and unpack here, @cvera86...

Are you using a customized config in your "master.conf" file, or are you just using the example (which is just for example purposes and won't actually work for building a functional container) config? If you run the build script with "-r -q" you're telling it to use the config you've either manually constructed or constructed with a prior run of the build script without the -r and -q flags to rebuild the container with your known-working configuration...

First thing I might want to see would be the contents of that master.config file (minus your credential secrets, obviously).

Also it would be useful to know what:

docker top epic_panini

returns, to see what's running inside your container.

I'm getting what looks like a firewall blocking access to port 443 on your host (which may be expected, since obviously I'm not inside your network), but it leads me to wonder if your EC2 instance (that does appear to be an EC2 host, based on what I can tell about it from here) has a firewall that's blocking port 443 (and probably, in that case, also the high ports from 5900 - 6000 that the Proconsul subordinate containers use for active sessions)...

From what you've given me, I'm afraid all I can tell you is that you're correct -- something isn't working. What that is is anyone's guess without more info about your configuration and what you're trying to do...

Also, I'm afraid all I have about you at the moment is that your GitHub handle is cvera86... I'm not sure if we've met before, if you're perhaps working with some of the folks I know from the NIH, or if this is your first foray into trying to interact with the Proconsul tool... Maybe I'm missing something obvious, but a little context would be really helpful, if you can offer it...

--Thanks, --Rob--

cvera86 commented 3 years ago

Hello @carte018

Sorry I didn't introduce myself. I work with Matthew and Ivan at NIAID. We use a customized master.conf file. I will talk with Matthew to see if it is okay to upload the file.

Yes, we are running the server in AWS. The ports we allow inbound to the servers are the high ports 5900-5999. After building the container, we start it by running the following script:

mkdir -p /srv/system_logs/supervisor
mkdir -p /srv/system_logs/httpd
mkdir -p /srv/system_logs/tomcat7
mkdir -p /srv/tomcat7/webapps
mkdir -p /srv/www
mkdir -p /etc/proconsul

docker run --restart=always --network=host -d -i -p 443:443 -p 3306:3306 -e TZ=America/New_York -v proconsul-log-data:/var/log -v proconsul-webapps-data:/var/lib/tomcat7/webapps -v proconsul-shibboleth-data:/etc/shibboleth -v proconsul-etc-data:/etc/proconsul -v proconsul-mysql-data:/var/lib/mysql -v proconsul-spool-data:/var/spool/docker -v proconsul-docker-gen:/opt/docker-gen -e "HOSTIP=`hostname -i`" -t carte018/proconsul

I ran the command that you mentioned above

 sudo docker top epic_panini
UID                 PID                 PPID                C                   STIME               TTY                 TIME                CMD
root                5047                5026                0                   Jul14               pts/0               00:01:10            /usr/bin/python /usr/bin/supervisord
root                5128                5047                0                   Jul14               pts/0               00:00:00            /bin/bash /usr/bin/mysqld_safe
root                5151                5047                0                   Jul14               ?                   00:00:17            /usr/sbin/apache2 -k start
33                  5154                5151                0                   Jul14               ?                   00:02:43            /usr/sbin/apache2 -k start
33                  5155                5151                0                   Jul14               ?                   00:02:41            /usr/sbin/apache2 -k start
104                 5554                5128                0                   Jul14               pts/0               00:03:58            /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --log-error=/var/log/mysql/error.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306
102                 5582                5047                0                   Jul14               ?                   00:15:56            /usr/lib/jvm/java-7-openjdk-amd64/bin/java -Djava.util.logging.config.file=/var/lib/tomcat7/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djava.awt.headless=true -Xmx128m -XX:+UseConcMarkSweepGC -Djava.endorsed.dirs=/usr/share/tomcat7/endorsed -classpath /usr/share/tomcat7/bin/bootstrap.jar:/usr/share/tomcat7/bin/tomcat-juli.jar -Dcatalina.base=/var/lib/tomcat7 -Dcatalina.home=/usr/share/tomcat7 -Djava.io.tmpdir=/tmp/tomcat7-tomcat7-tmp org.apache.catalina.startup.Bootstrap start
103                 5672                5047                0                   Jul14               ?                   00:00:25            /usr/sbin/shibd -f -c /etc/shibboleth/shibboleth2.xml -p /var/run/shibboleth/shibd.pid -w 30

Hopefully that was helpful, and if you need more details please let me know. Thank you in advance.

Chip V

carte018 commented 3 years ago

Hi, @cvera86 (Chip),

Thanks for the clarification, and sorry for not recognizing you!

Purely by way of making sure I cross all the "t"s and dot all the "i"s before we drill into more detailed stuff... You mentioned that you have ports 5900-5999 open, which is important (since those are the ports that the noVNC client talks to the subordinate Docker containers on), but you didn't mention having port 443 open to the host in Amazon. That's required, as well, since your browser will connect to the app itself on port 443 -- only after going through the app and getting a session established does the noVNC client get downloaded and connect back on a higher-numbered port. Since you mentioned you're getting an error trying just to connect to the application (rather than after going through the app to set up a connection to a specific target system), I'm thinking one possibility is that there's still a firewall blocking port 443 somewhere (either on the host itself or in the Amazon network configuration)...
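A quick way to tell whether port 443 is reachable from outside is to attempt a plain TCP connection to it. The following is a minimal sketch using only the Python standard library; the helper names `port_open` and `check_ports` are made up for illustration, and the host name is simply the one from this thread. A failed connection only says the port is blocked, closed, or timing out somewhere along the path, not where.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port to True (reachable) or False (blocked, closed, or timed out)."""
    return {port: port_open(host, port, timeout) for port in ports}


if __name__ == "__main__":
    host = "workingwith.ibrsp.org"  # host from this thread; substitute your own
    # 443 serves the app itself; 5900 is the first port of the session range.
    for port, ok in check_ports(host, [443, 5900]).items():
        print(port, "reachable" if ok else "not reachable")
```

Note that the 5900-5999 ports will probably only accept connections while a session container is actually listening on them, so a "not reachable" result there does not by itself prove a firewall problem. Port 443 should answer whenever the main container is up.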

It looks like everything you'd need to get to the app, at least, is running properly in the container. There's one thing missing -- the docker-gen instance that should be there to take care of cleaning up things after sessions terminate -- but we can worry about that after we're able to get your new instance accessible in general. All of the requisite processes are there -- the apache daemons, the shibd, and the tomcat instance are all running, at least, and it looks like the container has been configured to externalize the ports properly.
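The same check can be repeated after each restart by scanning the `docker top` output for the processes named above. This is only a sketch: the match strings are copied from the `docker top` listing earlier in this thread, except `docker-gen`, which is an assumed command name for the cleanup process, and `missing_processes` is a hypothetical helper.

```python
import subprocess

# Substrings taken from the `docker top` listing in this thread.
# "docker-gen" is an ASSUMED command name for the cleanup process.
EXPECTED = {
    "supervisord": "supervisord",
    "apache": "/usr/sbin/apache2",
    "mysqld": "/usr/sbin/mysqld",
    "tomcat": "org.apache.catalina.startup.Bootstrap",
    "shibd": "/usr/sbin/shibd",
    "docker-gen": "docker-gen",
}


def missing_processes(top_output, expected=EXPECTED):
    """Return the names in `expected` whose substring appears on no process line."""
    lines = top_output.splitlines()[1:]  # skip the header row
    return [name for name, needle in expected.items()
            if not any(needle in line for line in lines)]


if __name__ == "__main__":
    # Container name from this thread; may need sudo depending on your setup.
    result = subprocess.run(["docker", "top", "epic_panini"],
                            capture_output=True, text=True, check=True)
    print("missing:", missing_processes(result.stdout) or "none")
```

Run against the listing posted above, this would flag only docker-gen, which matches the reading of the output given here.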

If you have port 443 open all the way to the host and you're not getting to the app at the appropriate URL, there are a few distinct possibilities:

So, if you've got port 443 open all the way through (as far as you can tell), the next thing I'd probably want to see to start chasing down the problem is what your browser is showing you when the connection fails. If you can get me a screenshot of the browser window when it fails, that'll give me a better sense of which of the possibilities is most likely to be the one we need to focus on...

Seeing the configuration you're using will become important for debugging stuff at lower levels -- for now, though, if you're not getting a response from the server (or if you're getting some sort of 404 error, etc.) we probably need to figure out why you're not able to get to the app to begin with. Once we get you connected to the app, we can work out what's going on inside it in more detail...

--Thanks, --Rob--

cvera86 commented 3 years ago

Hi Rob

So yes, I did open port 443, reran the build, and started the container. Once I went to the URL workingwith.ibrsp.org/proconsul/, it allowed me to authenticate with SAML and took me to the Proconsul page. I have attached a screenshot of the page. [Screenshot: Proconsul error]

I ran docker logs 25a8523da862 and got the same error as before

/usr/lib/python2.7/dist-packages/supervisor/options.py:295: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
  'Supervisord is running as root and it is searching '
2021-07-19 15:28:02,012 CRIT Supervisor running as root (no user in config file)
2021-07-19 15:28:02,012 WARN Included extra file "/etc/supervisor/conf.d/99local.conf" during parsing
2021-07-19 15:28:02,041 INFO RPC interface 'supervisor' initialized
2021-07-19 15:28:02,042 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2021-07-19 15:28:02,042 INFO supervisord started with pid 1
2021-07-19 15:28:03,044 INFO spawned: 'shibd' with pid 10
2021-07-19 15:28:03,046 INFO spawned: 'tomcat' with pid 11
2021-07-19 15:28:03,047 INFO spawned: 'docker-gen' with pid 12
2021-07-19 15:28:03,049 INFO spawned: 'fixperm' with pid 13
2021-07-19 15:28:03,050 INFO spawned: 'mysqld' with pid 15
2021-07-19 15:28:03,056 INFO spawned: 'apache' with pid 16
2021-07-19 15:28:03,069 INFO success: shibd entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2021-07-19 15:28:03,069 INFO success: tomcat entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2021-07-19 15:28:03,069 INFO success: docker-gen entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2021-07-19 15:28:03,069 INFO success: fixperm entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2021-07-19 15:28:03,069 INFO success: mysqld entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2021-07-19 15:28:03,069 INFO success: apache entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2021-07-19 15:28:03,069 INFO exited: docker-gen (exit status 0; expected)
2021-07-19 15:28:03,069 INFO exited: fixperm (exit status 0; expected)
2021-07-19 15:28:03,076 CRIT reaped unknown pid 19)
2021-07-19 15:28:03,111 CRIT reaped unknown pid 56)
2021-07-19 15:28:03,771 CRIT reaped unknown pid 58)
2021-07-19 15:28:04,310 INFO exited: apache (exit status 0; expected)
2021-07-19 15:28:05,460 INFO exited: mysqld (exit status 0; expected)
2021-07-19 15:28:05,686 CRIT reaped unknown pid 597)
2021-07-19 15:28:07,465 INFO exited: shibd (exit status 0; expected)
2021-07-19 15:28:08,158 INFO exited: tomcat (exit status 1; not expected)

carte018 commented 3 years ago

Hi, @cvera86,

So, those logs are expected -- the supervisor executive logs what it sees as "processes exiting" when something daemonizes (detaches from its controlling terminal). The "expected" cases are all where the daemons run successfully and then detach as new PIDs. The tomcat "not expected" is an artifact of the way tomcat starts itself up and usually doesn't mean anything other than that tomcat takes longer than 1 second to complete its startup. So long as "docker top" indicates the tomcat is running, that should be fine, and given what you're seeing, your tomcat is running (else you'd not be able to get as far as you do).
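To pick out just the exits that supervisord itself flags as "not expected" (so they can be cross-checked against "docker top"), a small filter over the log text is enough. This is a sketch that assumes the log line format shown in this thread; `unexpected_exits` is a hypothetical helper, not part of supervisor.

```python
import re

# Matches lines such as:
#   2021-07-19 15:28:08,158 INFO exited: tomcat (exit status 1; not expected)
EXITED = re.compile(
    r"INFO exited: (\S+) \(exit status (\d+); (not expected|expected)\)"
)


def unexpected_exits(log_text):
    """Return {program: exit_status} for exits supervisord marked 'not expected'."""
    flagged = {}
    for match in EXITED.finditer(log_text):
        name, status, verdict = match.groups()
        if verdict == "not expected":
            flagged[name] = int(status)
    return flagged


if __name__ == "__main__":
    import sys
    # Usage: docker logs <container> 2>&1 | python this_script.py
    print(unexpected_exits(sys.stdin.read()))
```

A program appearing in the result is not necessarily broken. As described above, tomcat lands there because of how it starts up, so the result is a list of things to confirm with "docker top", not a list of failures.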

That empty display looks like a case where you (as a user) aren't authorized to access anything in particular. Proconsul's not really very chatty in that case -- if it's not been configured to give you access to any specific target systems in the environment, it'll simply show you as logged in and give you a blank page. That initial screen is populated with "ribbons" that are either configured in or configured out based on the overall configuration in the Proconsul database and your personal privileges (which are in turn controlled by configuration in the database which may give you individual access to something, or may grant you access via group memberships that were released to the Proconsul instance when you Shib'd in)...

Did you expect that you have access to anything? If you do, you'd have to have entries in the database to that effect. If you've just installed the app without setting up any targets or any access rights for yourself, though, that's what I'd expect to see...

I'd check to see if, in the /etc/proconsul/proconsuladmin.properties file, your ePPN (chip.vera@niaid.scienceforum.sc) has been granted admin rights (if it has, that will be listed in the file as one of the values in the "pcadminlist" property). If you're there, you can go to the "https:///admin/" URL and configure target hosts and access grants for yourself in the database. If you're not listed in the admin list, you can modify it in the container and restart the container, then try going into the admin interface...
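A rough way to script that check is to read the "pcadminlist" property and look for the ePPN in it. The sketch below makes two assumptions that should be verified against the actual file: that the property is a single `key=value` line, and that multiple values are comma-separated. `read_property` and `is_admin` are hypothetical helpers, and the parser deliberately ignores Java properties features such as line continuations and `:` separators.

```python
def read_property(text, key):
    """Return the value of `key` from simple key=value properties text, or None."""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


def is_admin(text, eppn, key="pcadminlist"):
    """True if `eppn` is one of the values of `key` (ASSUMES comma-separated)."""
    value = read_property(text, key)
    if value is None:
        return False
    return eppn in [entry.strip() for entry in value.split(",")]


if __name__ == "__main__":
    # Path from this thread; run inside the container or against a copy of the file.
    with open("/etc/proconsul/proconsuladmin.properties") as handle:
        contents = handle.read()
    print(is_admin(contents, "user@example.org"))  # placeholder ePPN
```

If the real file uses a different delimiter, only the `split(",")` call needs to change.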

There are other possibilities -- if the configuration's pointed at a database that isn't accessible, or if the configuration has the wrong database credentials, Proconsul won't be able to access the configuration in the database, and won't "know" what's available to you...

If you can get into the admin interface and you seem to see things that you should have access to, let me know what you're seeing in the admin interface -- I may want to get a look, at that point, at the configuration files you're working from, and get some more info to verify the AD environment you're working with matches the configuration you have in place...

There's a bunch of configuration, of course, that has to happen in the environment outside the container (the Docker daemon has to be listening for TCP connections in order for the docker-gen process to properly interact with it, and all sorts of things have to be configured properly in whatever AD you're using) but assuming all that's correct, it looks as though either you haven't loaded any target/access information into the database yet, or the Proconsul instance isn't getting to the database properly...

--Thanks, --Rob--

cvera86 commented 3 years ago

Hi Rob

Wanted to say thank you for your help. Once I opened port 443 and restarted the Active Directory server, I was able to get the list of servers. When I click on one of the servers to connect, it fails to connect, so I am trying to figure that out. Thank you again for all your help.

carte018 commented 3 years ago

Oh, good -- glad it was something easy to fix once we figured out what was going on! And you're more than welcome!

Sorry for the late reply -- I'm afraid this week has been a sea of other interrupts, so I've been remiss in keeping up with external stuff for a few days.

If I can help out with the connection failures, let me know? Getting this stuff working can be a bit like picking a lock -- there are a lot of pins that have to get into just the right positions to get everything working, and it can be hard sometimes to tell which pin is a little too far in which direction...