Closed Grinnz closed 5 months ago
I happen to have experience building Docker images for web applications.
I would love to contribute this feature to the project.
From the README.pod file I understood that there are 3 different backends for the web site and that the site only needs one to work.
So for an initial start-up sprint I would look into building a Docker image with SQLite, which would go into the same container as the Mojolicious web application.
I thought to use a Debian base image because it provides many prebuilt Perl libraries.
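As a rough sketch of that idea (the image tag and package names are assumptions, not the project's actual Dockerfile), a Debian base image can pull several dependencies prebuilt from APT:

```dockerfile
# Hypothetical starting point; package names are assumptions
FROM debian:buster-slim
RUN apt-get update && apt-get install -y --no-install-recommends \
        perl cpanminus libmojolicious-perl libdbd-sqlite3-perl \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /home/perldoc-browser
```

Everything APT cannot provide would still come from cpanm.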
I agree that a sqlite deployment would be a fine place to start. The deployment of elasticsearch nodes is wholly independent anyway.
I got the container already working on my development desktop:
$ wget -S -O /dev/null --max-redirect=0 http://127.0.0.1:3000|more
--2020-10-23 23:07:09-- http://127.0.0.1:3000/
Connecting to 127.0.0.1:3000... connected.
HTTP request sent, awaiting response...
HTTP/1.1 301 Moved Permanently
Date: Fri, 23 Oct 2020 22:07:09 GMT
Content-Length: 0
Server: Mojolicious (Perl)
Location: https://metacpan.org/pod/perl
Location: https://metacpan.org/pod/perl [following]
0 redirections exceeded.
The log at log/development.log shows:
$ tail log/development.log
[2020-10-23 21:57:24.81286] [2438] [info] Worker 2438 started
[2020-10-23 21:58:00.03674] [2436] [debug] [HJoRlss3] GET "/"
[2020-10-23 21:58:00.03979] [2436] [debug] [HJoRlss3] Routing to a callback
[2020-10-23 21:58:00.05269] [2436] [debug] [HJoRlss3] 301 Moved Permanently (0.015927s, 62.786/s)
[2020-10-23 22:04:36.75658] [2436] [debug] [lLDq2YlU] GET "/"
[2020-10-23 22:04:36.75720] [2436] [debug] [lLDq2YlU] Routing to a callback
[2020-10-23 22:04:36.77101] [2436] [debug] [lLDq2YlU] 301 Moved Permanently (0.014389s, 69.498/s)
[2020-10-23 22:07:09.81779] [2438] [debug] [HJoRlss3] GET "/"
[2020-10-23 22:07:09.82058] [2438] [debug] [HJoRlss3] Routing to a callback
[2020-10-23 22:07:09.83499] [2438] [debug] [HJoRlss3] 301 Moved Permanently (0.017235s, 58.021/s)
I'm wondering why it redirects to metacpan.org.
Although I tried to speed things up by preinstalling Perl modules in the container image, the installation still takes 2 minutes to complete according to the installation log.
$ date +"%s" > log/cpanm_install_2020-10-23.log ; cpanm -vn --installdeps --with-feature=sqlite . 2>&1 | tee -a log/cpanm_install_2020-10-23.log ; date +"%s" >> log/cpanm_install_2020-10-23.log
$ tail log/cpanm_install_2020-10-23.log|sed -n 10p
1603490068
$ cat log/cpanm_install_2020-10-23.log|sed -n 1p
1603489948
$ echo "scale=3; (1603490068-1603489948)/60"|bc -l
2.000
These 2 minutes would be spent on every container start-up.
So I'm thinking about including the cpanfile in the image and running the installation at image build time, though this has the drawback that the image needs to be rebuilt each time the cpanfile changes.
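A build-time installation could be sketched like this (the paths are assumptions; only the idea of copying the cpanfile into the image before running cpanm matters):

```dockerfile
# Hypothetical sketch: install dependencies at image build time so that
# container start-up does not pay the ~2 minute cpanm cost
COPY cpanfile /home/perldoc-browser/
WORKDIR /home/perldoc-browser
RUN cpanm -vn --installdeps --with-feature=sqlite .
```

The trade-off stays the same: any change to the cpanfile invalidates this layer and forces an image rebuild.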
I ran the command $ ./perldoc-browser.pl index all in the container, and the application indexed the documentation of the container's current Perl version, 5.28.1:
[...]
Indexing cpanm for 5.28.1 (/usr/bin/cpanm)
[...]
Indexing cpan for 5.28.1 (/usr/bin/cpan)
[...]
In the SQLite file perldoc-browser.sqlite I can see 1158 module documentation entries inserted.
So I was expecting to be able to browse the Perl 5.28.1 documentation, as at https://perldoc.pl/5.28.1/perl
But all calls to the web application redirect to metacpan.org:
$ wget -S -O /dev/null --max-redirect=0 "http://localhost:3000/5.28.1/perl"
--2020-10-25 11:56:50-- http://localhost:3000/5.28.1/perl
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)|::1|:3000... connected.
HTTP request sent, awaiting response...
HTTP/1.1 301 Moved Permanently
Content-Length: 0
Location: https://metacpan.org/pod/perl
Date: Sun, 25 Oct 2020 11:56:50 GMT
Server: Mojolicious (Perl)
Location: https://metacpan.org/pod/perl [following]
0 redirections exceeded.
The web application's log shows that it was running, received the request, and redirected:
$ tail log/development.log
[2020-10-25 11:55:13.61065] [2354] [info] Listening at "http://*:3000"
[2020-10-25 11:55:13.61110] [2354] [info] Manager 2354 started
[2020-10-25 11:55:13.61465] [2355] [info] Worker 2355 started
[2020-10-25 11:55:13.61688] [2356] [info] Worker 2356 started
[2020-10-25 11:55:13.61891] [2357] [info] Worker 2357 started
[2020-10-25 11:55:13.62025] [2354] [info] Creating process id file "/tmp/prefork.pid"
[2020-10-25 11:55:13.62073] [2358] [info] Worker 2358 started
[2020-10-25 11:56:50.27481] [2355] [debug] [7KmeokT8] GET "/5.28.1/perl"
[2020-10-25 11:56:50.27770] [2355] [debug] [7KmeokT8] Routing to a callback
[2020-10-25 11:56:50.31721] [2355] [debug] [7KmeokT8] 301 Moved Permanently (0.042344s, 23.616/s)
This behaviour is quite strange, though; as far as Mojolicious and SQLite support are concerned, the application is working in the built image.
Development has advanced to enable docker-compose support and automated cpanm installation at container start-up, at:
Docker Deployment Development
The entrypoint script entrypoint.sh checks at start-up whether the Mojolicious module is installed.
It runs the cpanm installation when it doesn't find the Mojolicious module.
Since Mojolicious isn't provided through the official repository, I think this is a good indicator of whether the installation was already done.
That way it proceeds right away to the application launch if it detects the local modules, and start-up is fast.
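The start-up check can be sketched as a tiny POSIX shell helper (a generic sketch, not the actual entrypoint.sh; in the real script the probe would be something like perl -MMojolicious -e1 and the installer a cpanm invocation):

```shell
# Run the installer only when the probe command fails (module missing).
ensure_deps() {
  probe=$1      # command that succeeds when the module is present
  installer=$2  # command that installs the dependencies
  if ! sh -c "$probe" >/dev/null 2>&1; then
    sh -c "$installer"
  fi
}

ensure_deps true  'echo installing deps'   # probe succeeds: installer skipped
ensure_deps false 'echo installing deps'   # probe fails: installer runs
```

After the check, the entrypoint can exec the requested command so the server becomes PID 1 in the container.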
# docker-compose up
Starting perldoc_web ... done
Attaching to perldoc_web
perldoc_web | Container '4d5f540e82bf.perldoc_web': 'entrypoint.sh' go ...
perldoc_web | Container '4d5f540e82bf.perldoc_web' - Network:
perldoc_web | 127.0.0.1^Ilocalhost$ ::1^Ilocalhost ip6-localhost ip6-loopback$ fe00::0^Iip6-localnet$ ff00::0^Iip6-mcastprefix$ ff02::1^Iip6-allnodes$ ff02::2^Iip6-allrouters$ 172.18.0.2^I4d5f540e82bf$
perldoc_web | Command: 'perldoc-browser.pl prefork'
perldoc_web | Configuring Local Installation ...
perldoc_web | PATH="/home/perldoc-browser/perl5/bin${PATH:+:${PATH}}"; export PATH;
perldoc_web | PERL5LIB="/home/perldoc-browser/perl5/lib/perl5${PERL5LIB:+:${PERL5LIB}}"; export PERL5LIB;
perldoc_web | PERL_LOCAL_LIB_ROOT="/home/perldoc-browser/perl5${PERL_LOCAL_LIB_ROOT:+:${PERL_LOCAL_LIB_ROOT}}"; export PERL_LOCAL_LIB_ROOT;
perldoc_web | PERL_MB_OPT="--install_base \"/home/perldoc-browser/perl5\""; export PERL_MB_OPT;
perldoc_web | PERL_MM_OPT="INSTALL_BASE=/home/perldoc-browser/perl5"; export PERL_MM_OPT;
perldoc_web | Mojolicious Version: 8.63 [Code: '0']
perldoc_web | Search Backend: sqlite
perldoc_web | Service 'perldoc-browser.pl': Launching ...
perldoc_web | Web application available at http://127.0.0.1:3000
# docker-compose ps
Name Command State Ports
-----------------------------------------------------------------------------
perldoc_web entrypoint.sh perldoc-brow ... Up 0.0.0.0:3000->3000/tcp
$ tail log/development.log
[2020-11-14 14:29:36.94808] [1] [info] Listening at "http://*:3000"
[2020-11-14 14:29:36.94852] [1] [info] Manager 1 started
[2020-11-14 14:29:36.95246] [27] [info] Worker 27 started
[2020-11-14 14:29:36.95468] [28] [info] Worker 28 started
[2020-11-14 14:29:36.95704] [29] [info] Worker 29 started
[2020-11-14 14:29:36.95934] [1] [info] Creating process id file "/tmp/prefork.pid"
[2020-11-14 14:29:36.95927] [30] [info] Worker 30 started
[2020-11-14 14:31:40.97146] [27] [debug] [E2zqZSP3] GET "/"
[2020-11-14 14:31:40.97402] [27] [debug] [E2zqZSP3] Routing to a callback
[2020-11-14 14:31:41.01940] [27] [debug] [E2zqZSP3] 301 Moved Permanently (0.047901s, 20.876/s)
It also recognizes the search_backend setting from the configuration file perldoc-browser.conf to enable the right features in the cpanm installation.
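That recognition step can be illustrated with a sed-based sketch (assuming the config is a Perl hashref like perldoc-browser.conf; the real entrypoint may parse it differently):

```shell
# Write a sample config and extract its search_backend value with sed.
cat > /tmp/perldoc-browser.conf <<'EOF'
{
  search_backend => 'sqlite',
}
EOF
backend=$(sed -n "s/.*search_backend[[:space:]]*=>[[:space:]]*'\([a-z]*\)'.*/\1/p" \
  /tmp/perldoc-browser.conf)
echo "$backend"
```

The extracted value can then feed cpanm --installdeps --with-feature="$backend" .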
It's important to note that the whole project directory is mounted into the container, so any changes made to files within the container are persisted to disk. The logs are also stored persistently, which is nice for troubleshooting issues.
With docker-compose the container can now be spun up easily with:
# docker-compose up
within the project directory.
Also, the Mojolicious application exits nicely and cleanly on Ctrl+C.
Its behaviour is somewhat similar to the Travis CI test environment, except that all changes are persistent.
So this can be a nice tool for running Test::Mojo suites.
I find that Mojolicious fits nicely into the container environment, since it runs in the foreground and exits cleanly on Ctrl+C.
The only thing missing for a better container deployment is that application activity logs are also written to STDOUT; that way it could actually be deployed as a Kubernetes container.
I think how to reproduce this container build is of common interest to all developers, so it's better to document it in its own README.pod file.
According to the code in /lib/PerldocBrowser/Plugin/PerldocRenderer.pm, the route /<perl_version>/search uses the backend to search for a given text passed as the URL parameter q.
So, on a correct installation and initialisation, the URL /5.28.1/search?q=foo should produce results on the web site, as at https://perldoc.pl/5.28.1/search?q=foo
And indeed, requesting http://localhost:3000/5.28.1/search?q=foo produces a valid HTML page:
$ wget -S -O search_foo.html --max-redirect=0 "http://localhost:3000/5.28.1/search?q=foo"
--2020-11-24 15:29:55-- http://localhost:3000/5.28.1/search?q=foo
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)[::1]:3000... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Date: Tue, 24 Nov 2020 15:29:56 GMT
Content-Security-Policy: default-src 'self'; connect-src 'self' www.google-analytics.com; img-src 'self' data: www.google-analytics.com www.googletagmanager.com; script-src 'self' 'unsafe-inline' cdnjs.cloudflare.com code.jquery.com stackpath.bootstrapcdn.com www.google-analytics.com www.googletagmanager.com; style-src 'self' 'unsafe-inline' cdnjs.cloudflare.com stackpath.bootstrapcdn.com; report-uri /csp-reports
Content-Type: text/html;charset=UTF-8
Server: Mojolicious (Perl)
Content-Length: 20150
Length: 20150 (20K) [text/html]
Saving to: “search_foo.html”
search_foo.html 100%[==================================>] 19,68K --.-KB/s in 0s
2020-11-24 15:29:56 (137 MB/s) - “search_foo.html” saved [20150/20150]
But in the saved file search_foo.html I found that the results were not complete.
The search could not find any Perl functions for this term.
So I looked into the database and found that the functions table was not populated.
In Debian, this documentation is distributed in a separate package, perl-doc.
Adding this package and reindexing the database now produces the expected results for the request http://localhost:3000/5.28.1/perl:
$ wget -S -O perl_index.html --max-redirect=0 "http://localhost:3000/5.28.1/perl"
--2020-11-24 16:32:52-- http://localhost:3000/5.28.1/perl
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)[::1]:3000... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Content-Security-Policy: default-src 'self'; connect-src 'self' www.google-analytics.com; img-src 'self' data: www.google-analytics.com www.googletagmanager.com; script-src 'self' 'unsafe-inline' cdnjs.cloudflare.com code.jquery.com stackpath.bootstrapcdn.com www.google-analytics.com www.googletagmanager.com; style-src 'self' 'unsafe-inline' cdnjs.cloudflare.com stackpath.bootstrapcdn.com; report-uri /csp-reports
Server: Mojolicious (Perl)
Content-Length: 37073
Content-Type: text/html;charset=UTF-8
Date: Tue, 24 Nov 2020 16:32:52 GMT
Length: 37073 (36K) [text/html]
Saving to: “perl_index.html”
perl_index.html 100%[==================================>] 36,20K --.-KB/s in 0s
2020-11-24 16:32:52 (237 MB/s) - “perl_index.html” saved [37073/37073]
as documented in the web service log:
[2020-11-24 16:32:52.36986] [22] [debug] [b0G-OsON] GET "/5.28.1/perl"
[2020-11-24 16:32:52.37060] [22] [debug] [b0G-OsON] Routing to a callback
[2020-11-24 16:32:52.58280] [22] [debug] [b0G-OsON] Rendering cached template "perldoc.html.ep"
[2020-11-24 16:32:52.58383] [22] [debug] [b0G-OsON] Rendering cached template "menubar.html.ep"
[2020-11-24 16:32:52.58968] [22] [debug] [b0G-OsON] 200 OK (0.219804s, 4.550/s)
After having concluded the first sprint, I will work on the next one: the integration with the PostgreSQL service in a separate container. I looked at the official PostgreSQL image on the Docker repository (Official Docker Image Dockerfile), but found that it compiles PostgreSQL from source. I prefer the cleaner approach of installing it from the mainstream repositories, as shown in the Docker documentation article: Deploy Postgres on Docker
Thinking of a possible use case for this development, such as creating a serverless testing environment for automated tests as described in:
GitHub Action Service Containers
the setup becomes easier using prebuilt, publicly available images.
So the Postgres image for Alpine Linux can easily be configured as described at:
Postgres Alpine Image with docker-compose
The image documentation describes at:
container configuration with Environment Variables
that the environment variables POSTGRES_USER and POSTGRES_PASSWORD are used to set up the database.
As the documentation explains, a user account perldoc is created with a matching database perldoc.
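Put together, the db service could be declared along these lines in docker-compose.yml (a sketch; the service name, password, and volume path are assumptions, not the project's actual compose file):

```yaml
services:
  db:
    image: postgres:alpine        # prebuilt Alpine Postgres image
    environment:
      POSTGRES_USER: perldoc      # account created on first start
      POSTGRES_PASSWORD: secret   # assumption: matches perldoc-browser.conf
    volumes:
      - ./data:/var/lib/postgresql/data   # persistent storage location
    ports:
      - "5432:5432"
```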
The data directory was assigned as the persistent database storage location.
Unfortunately this has the downside that a common developer user account can't build the web site Docker image, since the data directory will be owned by the postgres user account:
$ docker-compose up --build web
Creating network "perldoc_web_default" with the default driver
Building web
Traceback (most recent call last):
File "/usr/bin/docker-compose", line 11, in <module>
load_entry_point('docker-compose==1.21.0', 'console_scripts', 'docker-compose')()
File "/usr/lib/python3/dist-packages/compose/cli/main.py", line 71, in main
command()
File "/usr/lib/python3/dist-packages/compose/cli/main.py", line 127, in perform_command
handler(command, command_options)
File "/usr/lib/python3/dist-packages/compose/cli/main.py", line 1052, in up
to_attach = up(False)
File "/usr/lib/python3/dist-packages/compose/cli/main.py", line 1048, in up
silent=options.get('--quiet-pull'),
File "/usr/lib/python3/dist-packages/compose/project.py", line 466, in up
svc.ensure_image_exists(do_build=do_build, silent=silent)
File "/usr/lib/python3/dist-packages/compose/service.py", line 314, in ensure_image_exists
self.build()
File "/usr/lib/python3/dist-packages/compose/service.py", line 1027, in build
platform=platform,
File "/usr/lib/python3/dist-packages/docker/api/build.py", line 154, in build
path, exclude=exclude, dockerfile=dockerfile, gzip=gzip
File "/usr/lib/python3/dist-packages/docker/utils/build.py", line 30, in tar
files=sorted(exclude_paths(root, exclude, dockerfile=dockerfile[0])),
File "/usr/lib/python3/dist-packages/docker/utils/build.py", line 49, in exclude_paths
return set(pm.walk(root))
File "/usr/lib/python3/dist-packages/docker/utils/build.py", line 214, in rec_walk
for sub in rec_walk(cur):
File "/usr/lib/python3/dist-packages/docker/utils/build.py", line 184, in rec_walk
for f in os.listdir(current_dir):
PermissionError: [Errno 13] Permission denied: '/absolute/path/to/project/data'
Still, the web image can be built as the root user on the Docker host.
But to resolve this limitation, a custom database image would be needed that changes the user ID of the postgres user to match the development user's account ID.
As documented in the official Docker documentation on excluding files with .dockerignore, a .dockerignore file can avoid this conflict and also speed up the build by excluding the extensive .git directory from the build context.
Now common users with access to the docker service can build the image.
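A minimal .dockerignore for this layout could look like this (the exact entries are assumptions based on the directories mentioned in this thread):

```
data
.git
log
```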
Now the Docker Cluster consists of 2 Containers:
$ docker-compose up -d db
Creating network "perldoc_web_default" with the default driver
Creating perldoc_db ... done
$ docker-compose ps
Name Command State Ports
---------------------------------------------------------------------------
perldoc_db docker-entrypoint.sh postgres Up 0.0.0.0:5432->5432/tcp
$ docker-compose up -d web
Creating perldoc_web ... done
$ docker-compose ps
Name Command State Ports
-----------------------------------------------------------------------------
perldoc_db docker-entrypoint.sh postgres Up 0.0.0.0:5432->5432/tcp
perldoc_web entrypoint.sh perldoc-brow ... Up 0.0.0.0:3000->3000/tcp
$ docker-compose down
Stopping perldoc_web ... done
Stopping perldoc_db ... done
Removing perldoc_web ... done
Removing perldoc_db ... done
Removing network perldoc_web_default
To access the database container the configuration perldoc-browser.conf
needs to be changed to:
{
# [...]
pg => 'postgresql://perldoc:secret@db/perldoc',
# [...]
search_backend => 'pg',
# [...]
}
Within the Docker cluster, the container perldoc_db is known simply as db.
The docker-compose configuration also exposes port 5432, so the database can be accessed from the Docker host at localhost:5432.
But to use this configuration, perldoc-browser.pl must be run with docker-compose.
The Perl module installation is a local, per-user installation, as is common in serverless environments.
So it is convenient to use the entrypoint.sh script to launch the database initialization:
$ docker-compose up -d
Starting perldoc_web ... done
Starting perldoc_db ... done
$ docker-compose ps
Name Command State Ports
-----------------------------------------------------------------------------
perldoc_db docker-entrypoint.sh postgres Up 0.0.0.0:5432->5432/tcp
perldoc_web entrypoint.sh perldoc-brow ... Up 0.0.0.0:3000->3000/tcp
$ docker-compose exec web entrypoint.sh perldoc-browser.pl index all
Container 'b2046f460754.perldoc_web': 'entrypoint.sh' go ...
Container 'b2046f460754.perldoc_web' - Network:
127.0.0.1^Ilocalhost$ ::1^Ilocalhost ip6-localhost ip6-loopback$ fe00::0^Iip6-localnet$ ff00::0^Iip6-mcastprefix$ ff02::1^Iip6-allnodes$ ff02::2^Iip6-allrouters$ 172.20.0.2^Ib2046f460754$
Command: 'perldoc-browser.pl index all'
Configuring Local Installation ...
PATH="/home/perldoc-browser/perl5/bin${PATH:+:${PATH}}"; export PATH;
PERL5LIB="/home/perldoc-browser/perl5/lib/perl5${PERL5LIB:+:${PERL5LIB}}"; export PERL5LIB;
PERL_LOCAL_LIB_ROOT="/home/perldoc-browser/perl5${PERL_LOCAL_LIB_ROOT:+:${PERL_LOCAL_LIB_ROOT}}"; export PERL_LOCAL_LIB_ROOT;
PERL_MB_OPT="--install_base \"/home/perldoc-browser/perl5\""; export PERL_MB_OPT;
PERL_MM_OPT="INSTALL_BASE=/home/perldoc-browser/perl5"; export PERL_MM_OPT;
Mojolicious Version: 9.21 [Code: '0']
Search Backend: pg
Mojo::Pg Version: 4.25 [Code: '0']
Service 'perldoc-browser.pl': Launching ...
# [...]
Here the important outputs are:
Search Backend: pg
Mojo::Pg Version: 4.25 [Code: '0']
which indicate that the PostgreSQL backend is recognized and the Perl modules are installed.
Note that the web site can no longer run without the database backend, because the database host db cannot be resolved:
$ docker-compose up web
# [...]
perldoc_web | Search Backend: pg
perldoc_web | Mojo::Pg Version: 4.25 [Code: '0']
perldoc_web | Service 'perldoc-browser.pl': Launching ...
perldoc_web | DBI connect('dbname=perldoc;host=db','perldoc',...) failed: could not translate host name "db" to address: Name or service not known at /home/perldoc-browser/perl5/lib/perl5/Mojo/Pg.pm line 73.
perldoc_web exited with code 255
Now the last sprint involves introducing the Elasticsearch container into the cluster.
The cpanfile documents that the Elasticsearch client is for the 6.x series:
feature 'es', 'Elasticsearch search backend', sub {
requires 'Search::Elasticsearch' => '6.00';
requires 'Search::Elasticsearch::Client::6_0';
requires 'Log::Any::Adapter::MojoLog';
};
So the latest version of this series is 6.8, which corresponds to the blacktop/elasticsearch:6.8 image.
Within the cluster the Elasticsearch component will be known as elasticsearch, which must be configured in the perldoc-browser.conf configuration file.
The Elasticsearch data directory will be located in data/es/ and the PostgreSQL data directory will be moved to data/pg/.
Initially those directories will be provided by the git repository, but to make the PostgreSQL data directory accessible to the PostgreSQL container, it needs to be owned by the postgres (uid: 70) user:
# chown 70:70 data/pg
# cd data
# pwd
/absolute/path/to/project/data
# chmod a+rx pg
# ls -lah pg
total 60K
drwxr-xr-x 19 70 70 4,0K nov 13 11:34 .
drwxr-xr-x 4 bodo bodo 60 nov 13 11:22 ..
drwxr-xr-x 6 70 70 54 ago 21 19:18 base
drwxr-xr-x 2 70 70 4,0K nov 13 11:35 global
-rw-r--r-- 1 bodo bodo 0 nov 11 19:29 .keep
drwxr-xr-x 2 70 70 6 ago 21 19:18 pg_commit_ts
drwxr-xr-x 2 70 70 6 ago 21 19:18 pg_dynshmem
-rw-r--r-- 1 70 70 4,5K ago 21 19:18 pg_hba.conf
-rw-r--r-- 1 70 70 1,6K ago 21 19:18 pg_ident.conf
drwxr-xr-x 4 70 70 68 nov 13 11:39 pg_logical
drwxr-xr-x 4 70 70 36 ago 21 19:18 pg_multixact
drwxr-xr-x 2 70 70 18 nov 13 11:34 pg_notify
drwxr-xr-x 2 70 70 6 ago 21 19:18 pg_replslot
drwxr-xr-x 2 70 70 6 ago 21 19:18 pg_serial
drwxr-xr-x 2 70 70 6 ago 21 19:18 pg_snapshots
drwxr-xr-x 2 70 70 6 nov 13 11:34 pg_stat
drwxr-xr-x 2 70 70 63 nov 13 11:56 pg_stat_tmp
drwxr-xr-x 2 70 70 18 ago 21 19:18 pg_subtrans
drwxr-xr-x 2 70 70 6 ago 21 19:18 pg_tblspc
drwxr-xr-x 2 70 70 6 ago 21 19:18 pg_twophase
-rw-r--r-- 1 70 70 3 ago 21 19:18 PG_VERSION
drwxr-xr-x 3 70 70 220 ago 22 12:57 pg_wal
drwxr-xr-x 2 70 70 18 ago 21 19:18 pg_xact
-rw-r--r-- 1 70 70 88 ago 21 19:18 postgresql.auto.conf
-rw-r--r-- 1 70 70 24K ago 21 19:18 postgresql.conf
-rw-r--r-- 1 70 70 24 nov 13 11:34 postmaster.opts
-rw------- 1 70 70 94 nov 13 11:34 postmaster.pid
In order to commit the file data/pg/.keep, the data/pg/ directory needs to be made accessible again after each Postgres launch, because Postgres restricts its permissions on every launch.
On a fresh install, if the Postgres data directory is not made accessible to the postgres user, the database will fail to start up; the container will show as failed in the docker-compose status report.
$ docker-compose ps
Name Command State Ports
----------------------------------------------------------------------------------------
perldoc_db docker-entrypoint.sh postgres Exit 1
perldoc_es /elastic-entrypoint.sh ela ... Up 0.0.0.0:9200->9200/tcp, 9300/tcp
perldoc_web entrypoint.sh perldoc-brow ... Up 0.0.0.0:3000->3000/tcp
$ docker-compose up db
Creating network "perldoc_web_default" with the default driver
Creating perldoc_db ... done
Attaching to perldoc_db
perldoc_db | chmod: /var/lib/postgresql/data: Operation not permitted
perldoc_db | chmod: /var/run/postgresql: Operation not permitted
perldoc_db | initdb: could not look up effective user ID 1000: user does not exist
perldoc_db exited with code 1
On an initial install, the Elasticsearch container is unable to start up because of the default vm.max_map_count kernel limit:
$ docker-compose ps
Name Command State Ports
-------------------------------------------------------------------------------
perldoc_db docker-entrypoint.sh postgres Exit 1
perldoc_es /elastic-entrypoint.sh ela ... Exit 78
perldoc_web entrypoint.sh perldoc-brow ... Up 0.0.0.0:3000->3000/tcp
$ docker logs perldoc_es
warning: Falling back to java on path. This behavior is deprecated. Specify JAVA_HOME
[2021-11-13T09:49:48,221][WARN ][o.e.c.l.LogConfigurator ] [unknown] Some logging configurations have %marker but don't have %node_name. We will automatically add %node_name to the pattern to ease the migration for users who customize log4j2.properties but will stop this behavior in 7.0. You should manually replace `%node_name` with `[%node_name]%marker ` in these locations:
/usr/share/elasticsearch/config/log4j2.properties
[2021-11-13T09:49:50,449][INFO ][o.e.e.NodeEnvironment ] [Evou766] using [1] data paths, mounts [[/usr/share/elasticsearch/data (/dev/mapper/vg_laptop--bodo-lv_home)]], net usable_space [175.4gb], net total_space [186.1gb], types [xfs]
[2021-11-13T09:49:50,450][INFO ][o.e.e.NodeEnvironment ] [Evou766] heap size [1007.3mb], compressed ordinary object pointers [true]
[2021-11-13T09:49:50,453][INFO ][o.e.n.Node ] [Evou766] node name derived from node ID [Evou766tQwqgAOXMj5emuw]; set [node.name] to override
[2021-11-13T09:49:50,454][INFO ][o.e.n.Node ] [Evou766] version[6.8.13], pid[1], build[oss/tar/be13c69/2020-10-16T09:09:46.555371Z], OS[Linux/4.19.0-18-amd64/amd64], JVM[IcedTea/OpenJDK 64-Bit Server VM/1.8.0_242/25.242-b08]
[2021-11-13T09:49:50,454][INFO ][o.e.n.Node ] [Evou766] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/usr/share/elasticsearch/tmp, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -Xloggc:logs/gc.log, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=32, -XX:GCLogFileSize=64m, -Des.cgroups.hierarchy.override=/, -Des.path.home=/usr/share/elasticsearch, -Des.path.conf=/usr/share/elasticsearch/config, -Des.distribution.flavor=oss, -Des.distribution.type=tar]
[2021-11-13T09:49:56,930][INFO ][o.e.p.PluginsService ] [Evou766] loaded module [aggs-matrix-stats]
[2021-11-13T09:49:56,930][INFO ][o.e.p.PluginsService ] [Evou766] loaded module [analysis-common]
[2021-11-13T09:49:56,931][INFO ][o.e.p.PluginsService ] [Evou766] loaded module [ingest-common]
[2021-11-13T09:49:56,931][INFO ][o.e.p.PluginsService ] [Evou766] loaded module [ingest-geoip]
[2021-11-13T09:49:56,931][INFO ][o.e.p.PluginsService ] [Evou766] loaded module [ingest-user-agent]
[2021-11-13T09:49:56,931][INFO ][o.e.p.PluginsService ] [Evou766] loaded module [lang-expression]
[2021-11-13T09:49:56,931][INFO ][o.e.p.PluginsService ] [Evou766] loaded module [lang-mustache]
[2021-11-13T09:49:56,931][INFO ][o.e.p.PluginsService ] [Evou766] loaded module [lang-painless]
[2021-11-13T09:49:56,931][INFO ][o.e.p.PluginsService ] [Evou766] loaded module [mapper-extras]
[2021-11-13T09:49:56,931][INFO ][o.e.p.PluginsService ] [Evou766] loaded module [parent-join]
[2021-11-13T09:49:56,931][INFO ][o.e.p.PluginsService ] [Evou766] loaded module [percolator]
[2021-11-13T09:49:56,931][INFO ][o.e.p.PluginsService ] [Evou766] loaded module [rank-eval]
[2021-11-13T09:49:56,932][INFO ][o.e.p.PluginsService ] [Evou766] loaded module [reindex]
[2021-11-13T09:49:56,932][INFO ][o.e.p.PluginsService ] [Evou766] loaded module [repository-url]
[2021-11-13T09:49:56,932][INFO ][o.e.p.PluginsService ] [Evou766] loaded module [transport-netty4]
[2021-11-13T09:49:56,932][INFO ][o.e.p.PluginsService ] [Evou766] loaded module [tribe]
[2021-11-13T09:49:56,933][INFO ][o.e.p.PluginsService ] [Evou766] no plugins loaded
[2021-11-13T09:50:13,538][INFO ][o.e.d.DiscoveryModule ] [Evou766] using discovery type [zen] and host providers [settings]
[2021-11-13T09:50:15,808][INFO ][o.e.n.Node ] [Evou766] initialized
[2021-11-13T09:50:15,809][INFO ][o.e.n.Node ] [Evou766] starting ...
[2021-11-13T09:50:16,474][INFO ][o.e.t.TransportService ] [Evou766] publish_address {172.19.0.4:9300}, bound_addresses {0.0.0.0:9300}
[2021-11-13T09:50:16,526][INFO ][o.e.b.BootstrapChecks ] [Evou766] bound or publishing to a non-loopback address, enforcing bootstrap checks
ERROR: [1] bootstrap checks failed
[1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[2021-11-13T09:50:16,550][INFO ][o.e.n.Node ] [Evou766] stopping ...
[2021-11-13T09:50:16,634][INFO ][o.e.n.Node ] [Evou766] stopped
[2021-11-13T09:50:16,634][INFO ][o.e.n.Node ] [Evou766] closing ...
[2021-11-13T09:50:16,678][INFO ][o.e.n.Node ] [Evou766] closed
So the vm.max_map_count limit must be increased as the root user:
# sysctl -w vm.max_map_count=262144
vm.max_map_count = 262144
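To make the setting survive reboots, it can also be dropped into a sysctl configuration file (the file name here is an assumption):

```
# /etc/sysctl.d/90-elasticsearch.conf
vm.max_map_count = 262144
```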
There is great documentation about the effects this setting has on the Docker host system, published by the SUSE distribution at: Documentation on vm.max_map_count
To use the Elasticsearch search backend, it must be configured in the perldoc-browser.conf configuration file:
es => 'http://elasticsearch:9200',
search_backend => 'es',
Before the search backend is usable, the indexing script must be run with:
$ docker-compose exec web entrypoint.sh perldoc-browser.pl index all
Container 'db3095cd2a39.perldoc_web': 'entrypoint.sh' go ...
Container 'db3095cd2a39.perldoc_web' - Network:
127.0.0.1^Ilocalhost$ ::1^Ilocalhost ip6-localhost ip6-loopback$ fe00::0^Iip6-localnet$ ff00::0^Iip6-mcastprefix$ ff02::1^Iip6-allnodes$ ff02::2^Iip6-allrouters$ 172.19.0.4^Idb3095cd2a39$
Command: 'perldoc-browser.pl index all'
Configuring Local Installation ...
PATH="/home/perldoc-browser/perl5/bin${PATH:+:${PATH}}"; export PATH;
PERL5LIB="/home/perldoc-browser/perl5/lib/perl5${PERL5LIB:+:${PERL5LIB}}"; export PERL5LIB;
PERL_LOCAL_LIB_ROOT="/home/perldoc-browser/perl5${PERL_LOCAL_LIB_ROOT:+:${PERL_LOCAL_LIB_ROOT}}"; export PERL_LOCAL_LIB_ROOT;
PERL_MB_OPT="--install_base \"/home/perldoc-browser/perl5\""; export PERL_MB_OPT;
PERL_MM_OPT="INSTALL_BASE=/home/perldoc-browser/perl5"; export PERL_MM_OPT;
Mojolicious Version: 9.21 [Code: '0']
Search Backend: es
Search::Elasticsearch Version: 7.715 [Code: '0']
Service 'perldoc-browser.pl': Launching ...
[2021-11-13 10:11:18.25728] [26] [info] Current cxns: ["http://elasticsearch:9200"]
[2021-11-13 10:11:18.25758] [26] [info] Forcing ping before next use on all live cxns
[2021-11-13 10:11:18.25780] [26] [info] Ping [http://elasticsearch:9200] before next request
[2021-11-13 10:11:30.97793] [26] [info] Pinging [http://elasticsearch:9200]
[2021-11-13 10:11:31.29653] [26] [info] Marking [http://elasticsearch:9200] as live
# [...]
Swapping faqs_5.28.1 index(es) => faqs_5.28.1_1636798290
Swapping pods_5.28.1 index(es) => pods_5.28.1_1636798290
Swapping functions_5.28.1 index(es) => functions_5.28.1_1636798290
Swapping perldeltas_5.28.1 index(es) => perldeltas_5.28.1_1636798290
Swapping variables_5.28.1 index(es) => variables_5.28.1_1636798290
After indexing, the Elasticsearch component has 5 indices for each Perl version:
$ curl -v http://localhost:9200/_cat/indices
# [...]
* Trying ::1...
* TCP_NODELAY set
* Expire in 149997 ms for 3 (transfer 0x55f60a399e20)
* Expire in 200 ms for 4 (transfer 0x55f60a399e20)
* Connected to localhost (::1) port 9200 (#0)
> GET /_cat/indices HTTP/1.1
> Host: localhost:9200
> User-Agent: curl/7.64.0
> Accept: */*
>
< HTTP/1.1 200 OK
< content-type: text/plain; charset=UTF-8
< content-length: 455
<
yellow open perldeltas_5.28.1_1636798290 BTS4QdaeQk6OJLFnyYUI9g 1 1 2164 0 3.2mb 3.2mb
yellow open faqs_5.28.1_1636798290 gyrqSq7mQrKXzAmQJ4cGVA 1 1 305 0 784.9kb 784.9kb
yellow open variables_5.28.1_1636798290 wjDlOrQrRaWb77HTKhdA5Q 1 1 150 0 17.1kb 17.1kb
yellow open pods_5.28.1_1636798290 PJ-EZ0IbQb67EOzkGrVj1w 1 1 1579 0 23.2mb 23.2mb
yellow open functions_5.28.1_1636798290 xzukrTriSNWiyPqKMpZU4w 1 1 292 0 570.5kb 570.5kb
* Connection #0 to host localhost left intact
and the "foo" search will produce a complete search result:
$ curl -v -o search_foo.html "http://localhost:3000/5.28.1/search?q=foo"
# [...]
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Expire in 0 ms for 1 (transfer 0x55a2ab834e20)
# [...]
* Trying ::1...
* TCP_NODELAY set
* Expire in 149998 ms for 3 (transfer 0x55a2ab834e20)
* Expire in 200 ms for 4 (transfer 0x55a2ab834e20)
* Connected to localhost (::1) port 3000 (#0)
> GET /5.28.1/search?q=foo HTTP/1.1
> Host: localhost:3000
> User-Agent: curl/7.64.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Sat, 13 Nov 2021 11:04:38 GMT
< Content-Length: 52296
< Content-Security-Policy: default-src 'self'; connect-src 'self' www.google-analytics.com; img-src 'self' data: www.google-analytics.com www.googletagmanager.com; script-src 'self' 'unsafe-inline' cdnjs.cloudflare.com code.jquery.com stackpath.bootstrapcdn.com www.google-analytics.com www.googletagmanager.com; style-src 'self' 'unsafe-inline' cdnjs.cloudflare.com stackpath.bootstrapcdn.com; report-uri /csp-reports
< Content-Type: text/html;charset=UTF-8
< Server: Mojolicious (Perl)
<
{ [52296 bytes data]
* Connection #0 to host localhost left intact
Before running the indexation, it is important to check whether the Elasticsearch component is ready for service:
$ curl -v http://localhost:9200
# [...]
* Trying ::1...
* TCP_NODELAY set
* Expire in 150000 ms for 3 (transfer 0x56117c89ae20)
* Expire in 200 ms for 4 (transfer 0x56117c89ae20)
* Connected to localhost (::1) port 9200 (#0)
> GET / HTTP/1.1
> Host: localhost:9200
> User-Agent: curl/7.64.0
> Accept: */*
>
< HTTP/1.1 200 OK
< content-type: application/json; charset=UTF-8
< content-length: 490
<
{
"name" : "Evou766",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "D7p_jR1TQBeK7J69Hk3QRg",
"version" : {
"number" : "6.8.13",
"build_flavor" : "oss",
"build_type" : "tar",
"build_hash" : "be13c69",
"build_date" : "2020-10-16T09:09:46.555371Z",
"build_snapshot" : false,
"lucene_version" : "7.7.3",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
}
* Connection #0 to host localhost left intact
Otherwise the Search::Elasticsearch::Role::Cxn role throws a NoNodes exception and the indexation fails:
$ docker-compose up -d
Creating network "perldoc_web_default" with the default driver
Creating perldoc_es ... done
Creating perldoc_db ... done
Creating perldoc_web ... done
$ docker-compose exec web entrypoint.sh perldoc-browser.pl index all
Container 'de674593309b.perldoc_web': 'entrypoint.sh' go ...
Container 'de674593309b.perldoc_web' - Network:
127.0.0.1	localhost
::1	localhost ip6-localhost ip6-loopback
fe00::0	ip6-localnet
ff00::0	ip6-mcastprefix
ff02::1	ip6-allnodes
ff02::2	ip6-allrouters
172.25.0.3	de674593309b
Command: 'perldoc-browser.pl index all'
Configuring Local Installation ...
PATH="/home/perldoc-browser/perl5/bin${PATH:+:${PATH}}"; export PATH;
PERL5LIB="/home/perldoc-browser/perl5/lib/perl5${PERL5LIB:+:${PERL5LIB}}"; export PERL5LIB;
PERL_LOCAL_LIB_ROOT="/home/perldoc-browser/perl5${PERL_LOCAL_LIB_ROOT:+:${PERL_LOCAL_LIB_ROOT}}"; export PERL_LOCAL_LIB_ROOT;
PERL_MB_OPT="--install_base \"/home/perldoc-browser/perl5\""; export PERL_MB_OPT;
PERL_MM_OPT="INSTALL_BASE=/home/perldoc-browser/perl5"; export PERL_MM_OPT;
Mojolicious Version: 9.21 [Code: '0']
Search Backend: es
Search::Elasticsearch Version: 7.715 [Code: '0']
Service 'perldoc-browser.pl': Launching ...
[2021-11-13 12:52:59.01476] [12] [info] Current cxns: ["http://elasticsearch:9200"]
[2021-11-13 12:52:59.01538] [12] [info] Forcing ping before next use on all live cxns
[2021-11-13 12:52:59.01664] [12] [info] Ping [http://elasticsearch:9200] before next request
[2021-11-13 12:53:41.98002] [12] [info] Pinging [http://elasticsearch:9200]
[2021-11-13 12:53:44.37398] [12] [debug] [Cxn] ** [http://elasticsearch:9200]-[599] Could not connect to 'elasticsearch:9200': Connection refused, called from sub Search::Elasticsearch::Role::Client::Direct::__ANON__ at /home/perldoc-browser/lib/PerldocBrowser/Plugin/PerldocSearch/Elastic.pm line 352. With vars: {'status_code' => 599,'request' => {'method' => 'HEAD','timeout' => 2,'path' => '/'}}
[2021-11-13 12:53:44.37455] [12] [info] Marking [http://elasticsearch:9200] as dead. Next ping at: Sat Nov 13 12:54:44 2021
[2021-11-13 12:53:44.39780] [12] [fatal] [NoNodes] ** No nodes are available: [http://elasticsearch:9200], called from sub Search::Elasticsearch::Role::Client::Direct::__ANON__ at /home/perldoc-browser/lib/PerldocBrowser/Plugin/PerldocSearch/Elastic.pm line 352.
This happens because Elasticsearch is known to be slow to start up, which is also documented at: ElasticSearch slow Startup produces Index Corruption
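One way to avoid this race is to poll Elasticsearch until it answers before kicking off the indexation. A minimal sketch in shell, assuming the service URL from the log above (wait_for_http is a hypothetical helper, not part of the project):

```shell
#!/bin/sh
# Poll a URL until it answers with an HTTP success status, or give up.
# $1 = URL, $2 = max attempts (default 30), 2 seconds between attempts.
wait_for_http() {
  url=$1
  retries=${2:-30}
  i=0
  until curl -fsS --max-time 5 "$url" >/dev/null 2>&1; do
    i=$((i + 1))
    if [ "$i" -ge "$retries" ]; then
      return 1    # gave up: the node never became reachable
    fi
    sleep 2
  done
  return 0
}

# e.g. in entrypoint.sh, before the indexation:
# wait_for_http http://elasticsearch:9200 && perldoc-browser.pl index all
```

A docker-compose healthcheck on the elasticsearch service combined with depends_on would achieve the same ordering declaratively.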
Reproducing the build on a new system, it turned out that the obstacle is not the ownership of the PostgreSQL data directory but the committed file data/pg/.keep
inside it:
perldoc_db | initdb: directory "/var/lib/postgresql/data" exists but is not empty
perldoc_db | It contains a dot-prefixed/invisible file, perhaps due to it being a mount point.
perldoc_db exited with code 1
So the data/pg/.keep
file must be removed for the database installation to work.
The ownership of the data directory is changed automatically during installation.
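The cleanup itself is a one-liner; the path comes from the error message above:

```shell
# Remove the checked-in placeholder so the bind-mounted data directory is
# empty on first start; initdb refuses to initialize a non-empty directory.
rm -f data/pg/.keep
```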
I don't have any experience with Docker, but it would be nice for other people to be able to easily deploy instances. There are two components: the web app (standard Mojolicious application deployed with hypnotoad), and the elasticsearch server (version 6, needs at least 2 CPUs and 2GB RAM per node).