Grinnz / perldoc-browser

Perldoc Browser
https://perldoc.perl.org
Artistic License 2.0

Dockerize deployment #26

Closed Grinnz closed 5 months ago

Grinnz commented 4 years ago

I don't have any experience with Docker, but it would be nice for other people to be able to easily deploy instances. There are two components: the web app (standard Mojolicious application deployed with hypnotoad), and the elasticsearch server (version 6, needs at least 2 CPUs and 2GB RAM per node).

bodo-hugo-barwich commented 4 years ago

I happen to have experience in building Docker images for web applications, and I would love to contribute this feature to the project. From the README.pod file I understood that there are 3 different backends for the web site and that the site only needs one of them to work. So for an initial start-up sprint I would look into building a Docker image with SQLite, which would go into the same container as the Mojolicious web application. I plan to use a Debian base image because it provides many prebuilt Perl libraries.

Grinnz commented 4 years ago

I agree that a sqlite deployment would be a fine place to start. The deployment of elasticsearch nodes is wholly independent anyway.

bodo-hugo-barwich commented 4 years ago

I got the container already working on my development desktop:

$ wget -S -O /dev/null --max-redirect=0 http://127.0.0.1:3000|more
--2020-10-23 23:07:09--  http://127.0.0.1:3000/
Connecting to 127.0.0.1:3000... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 301 Moved Permanently
  Date: Fri, 23 Oct 2020 22:07:09 GMT
  Content-Length: 0
  Server: Mojolicious (Perl)
  Location: https://metacpan.org/pod/perl
Location: https://metacpan.org/pod/perl [following]
0 redirections exceeded.

The log at log/development.log shows:

$ tail log/development.log
[2020-10-23 21:57:24.81286] [2438] [info] Worker 2438 started
[2020-10-23 21:58:00.03674] [2436] [debug] [HJoRlss3] GET "/"
[2020-10-23 21:58:00.03979] [2436] [debug] [HJoRlss3] Routing to a callback
[2020-10-23 21:58:00.05269] [2436] [debug] [HJoRlss3] 301 Moved Permanently (0.015927s, 62.786/s)
[2020-10-23 22:04:36.75658] [2436] [debug] [lLDq2YlU] GET "/"
[2020-10-23 22:04:36.75720] [2436] [debug] [lLDq2YlU] Routing to a callback
[2020-10-23 22:04:36.77101] [2436] [debug] [lLDq2YlU] 301 Moved Permanently (0.014389s, 69.498/s)
[2020-10-23 22:07:09.81779] [2438] [debug] [HJoRlss3] GET "/"
[2020-10-23 22:07:09.82058] [2438] [debug] [HJoRlss3] Routing to a callback
[2020-10-23 22:07:09.83499] [2438] [debug] [HJoRlss3] 301 Moved Permanently (0.017235s, 58.021/s)

I'm wondering why it redirects to metacpan.org.

Although I tried to speed up the installation of the Perl modules by pre-installing them in the container image, the installation still takes about 2 minutes to complete according to the installation log.

$ date +"%s" > log/cpanm_install_2020-10-23.log ; cpanm -vn --installdeps --with-feature=sqlite . 2>&1 | tee -a log/cpanm_install_2020-10-23.log ; date +"%s" >> log/cpanm_install_2020-10-23.log
$ tail log/cpanm_install_2020-10-23.log|sed -n 10p
1603490068
$ cat log/cpanm_install_2020-10-23.log|sed -n 1p
1603489948
$ echo "scale=3; (1603490068-1603489948)/60"|bc -l
2.000

These 2 minutes would be consumed on each Container Start-Up.

So I'm thinking about including the cpanfile in the image and running the installation at image build time. This has the drawback that the image needs to be rebuilt each time the cpanfile changes.
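
A sketch of that approach, with an illustrative image tag and Dockerfile steps (nothing here exists in the repository yet):

# sketch: bake the dependencies into the image at build time (image tag is illustrative)
docker build -t perldoc-browser:sqlite .
# the corresponding Dockerfile step would boil down to something like:
#   COPY cpanfile .
#   RUN cpanm -vn --installdeps --with-feature=sqlite .
# container start-up then skips the 2-minute installation:
docker run --rm -p 3000:3000 perldoc-browser:sqlite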

bodo-hugo-barwich commented 4 years ago

I ran the command $ ./perldoc-browser.pl index all in the container and the application indexed the documentation of the container's current Perl version, 5.28.1:

[...]
Indexing cpanm for 5.28.1 (/usr/bin/cpanm)
[...]
Indexing cpan for 5.28.1 (/usr/bin/cpan)
[...]

I can see 1158 module documentation entries inserted into the SQLite file perldoc-browser.sqlite. So I was expecting to be able to browse the documentation of Perl 5.28.1, like at https://perldoc.pl/5.28.1/perl. But all calls to the web application redirect to metacpan.org:

$ wget -S -O /dev/null --max-redirect=0 "http://localhost:3000/5.28.1/perl"
--2020-10-25 11:56:50--  http://localhost:3000/5.28.1/perl
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)|::1|:3000... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 301 Moved Permanently
  Content-Length: 0
  Location: https://metacpan.org/pod/perl
  Date: Sun, 25 Oct 2020 11:56:50 GMT
  Server: Mojolicious (Perl)
Location: https://metacpan.org/pod/perl [following]
0 redirections exceeded.

The web application's log shows that it was running, received the request and redirected:

$ tail log/development.log
[2020-10-25 11:55:13.61065] [2354] [info] Listening at "http://*:3000"
[2020-10-25 11:55:13.61110] [2354] [info] Manager 2354 started
[2020-10-25 11:55:13.61465] [2355] [info] Worker 2355 started
[2020-10-25 11:55:13.61688] [2356] [info] Worker 2356 started
[2020-10-25 11:55:13.61891] [2357] [info] Worker 2357 started
[2020-10-25 11:55:13.62025] [2354] [info] Creating process id file "/tmp/prefork.pid"
[2020-10-25 11:55:13.62073] [2358] [info] Worker 2358 started
[2020-10-25 11:56:50.27481] [2355] [debug] [7KmeokT8] GET "/5.28.1/perl"
[2020-10-25 11:56:50.27770] [2355] [debug] [7KmeokT8] Routing to a callback
[2020-10-25 11:56:50.31721] [2355] [debug] [7KmeokT8] 301 Moved Permanently (0.042344s, 23.616/s)

This behaviour is quite strange. As far as the Mojolicious web part and the SQLite support are concerned, the application is working in the built image.

bodo-hugo-barwich commented 4 years ago

The development has advanced to enable docker-compose support and automated cpanm installation at container start-up (see: Docker Deployment Development). The entrypoint script entrypoint.sh checks at start-up whether the Mojolicious module is installed and runs the cpanm installation when it doesn't find it. Since Mojolicious isn't provided through the official repository, I think this is a good indicator of whether the installation was already done. That way, if it detects the local modules, it proceeds right away to the application launch and the start-up is fast.
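
For illustration, a minimal sketch of the kind of check the entrypoint performs, assuming a local::lib installation under ~/perl5 (the actual entrypoint.sh may differ in detail):

# sketch: install dependencies only when Mojolicious is missing, then launch the application
eval "$(perl -I ~/perl5/lib/perl5 -Mlocal::lib 2>/dev/null)"
if perl -MMojolicious -e 'print "$Mojolicious::VERSION\n"' 2>/dev/null; then
  echo "Mojolicious found - skipping cpanm installation"
else
  cpanm -vn --installdeps --with-feature=sqlite .
fi
exec ./perldoc-browser.pl prefork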

# docker-compose up
Starting perldoc_web ... done
Attaching to perldoc_web
perldoc_web | Container '4d5f540e82bf.perldoc_web': 'entrypoint.sh' go ...
perldoc_web | Container '4d5f540e82bf.perldoc_web' - Network:
perldoc_web | 127.0.0.1^Ilocalhost$ ::1^Ilocalhost ip6-localhost ip6-loopback$ fe00::0^Iip6-localnet$ ff00::0^Iip6-mcastprefix$ ff02::1^Iip6-allnodes$ ff02::2^Iip6-allrouters$ 172.18.0.2^I4d5f540e82bf$
perldoc_web | Command: 'perldoc-browser.pl prefork'
perldoc_web | Configuring Local Installation ...
perldoc_web | PATH="/home/perldoc-browser/perl5/bin${PATH:+:${PATH}}"; export PATH;
perldoc_web | PERL5LIB="/home/perldoc-browser/perl5/lib/perl5${PERL5LIB:+:${PERL5LIB}}"; export PERL5LIB;
perldoc_web | PERL_LOCAL_LIB_ROOT="/home/perldoc-browser/perl5${PERL_LOCAL_LIB_ROOT:+:${PERL_LOCAL_LIB_ROOT}}"; export PERL_LOCAL_LIB_ROOT;
perldoc_web | PERL_MB_OPT="--install_base \"/home/perldoc-browser/perl5\""; export PERL_MB_OPT;
perldoc_web | PERL_MM_OPT="INSTALL_BASE=/home/perldoc-browser/perl5"; export PERL_MM_OPT;
perldoc_web | Mojolicious Version: 8.63 [Code: '0']
perldoc_web | Search Backend: sqlite
perldoc_web | Service 'perldoc-browser.pl': Launching ...
perldoc_web | Web application available at http://127.0.0.1:3000
# docker-compose ps
   Name                  Command               State           Ports         
-----------------------------------------------------------------------------
perldoc_web   entrypoint.sh perldoc-brow ...   Up      0.0.0.0:3000->3000/tcp
$ tail log/development.log
[2020-11-14 14:29:36.94808] [1] [info] Listening at "http://*:3000"
[2020-11-14 14:29:36.94852] [1] [info] Manager 1 started
[2020-11-14 14:29:36.95246] [27] [info] Worker 27 started
[2020-11-14 14:29:36.95468] [28] [info] Worker 28 started
[2020-11-14 14:29:36.95704] [29] [info] Worker 29 started
[2020-11-14 14:29:36.95934] [1] [info] Creating process id file "/tmp/prefork.pid"
[2020-11-14 14:29:36.95927] [30] [info] Worker 30 started
[2020-11-14 14:31:40.97146] [27] [debug] [E2zqZSP3] GET "/"
[2020-11-14 14:31:40.97402] [27] [debug] [E2zqZSP3] Routing to a callback
[2020-11-14 14:31:41.01940] [27] [debug] [E2zqZSP3] 301 Moved Permanently (0.047901s, 20.876/s)

It also recognizes the search_backend from the configuration file perldoc-browser.conf to enable the right features in the cpanm Installation.
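
For illustration only, one way the backend could be read from the configuration and turned into a cpanm feature flag; the variable name and the fallback value are made up, and the real script may do this differently:

# sketch: read search_backend from the Perl config file and enable the matching feature
backend=$(perl -e 'my $conf = do "./perldoc-browser.conf"; print $conf->{search_backend} // "sqlite"')
cpanm -vn --installdeps --with-feature="$backend" .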

It's important to note that the whole project directory is mounted into the container, so any changes made to the files within the container are persisted to disk. The logs are also persisted to disk, which is nice for troubleshooting issues.

With docker-compose the container can now be spun up easily with:

# docker-compose up

within the project directory. The Mojolicious application also exits cleanly on Ctrl+C. Its behaviour is somewhat similar to the Travis CI test environment, except that all changes are persistent.

So this can be a nice tool for running Test::Mojo suites.

I find that Mojolicious fits nicely into the container environment, since it starts in the foreground and exits cleanly on Ctrl+C. The only thing missing for a better container deployment is that the application activity logs are also written to STDOUT. That way it could actually be deployed as a Kubernetes container.

bodo-hugo-barwich commented 4 years ago

As for how to reproduce this container build, I think it is of common interest to all developers, so it's better to document it in its own README.pod file.

bodo-hugo-barwich commented 3 years ago

According to the code in /lib/PerldocBrowser/Plugin/PerldocRenderer.pm, the route /<perl_version>/search uses the backend to search for a given search text passed as the URL parameter q. So with a correct installation and initialisation the URL /5.28.1/search?q=foo should produce results on the web site, like at https://perldoc.pl/5.28.1/search?q=foo. And indeed, requesting http://localhost:3000/5.28.1/search?q=foo produces a valid HTML page:

$ wget -S -O search_foo.html --max-redirect=0 "http://localhost:3000/5.28.1/search?q=foo"
--2020-11-24 15:29:55--  http://localhost:3000/5.28.1/search?q=foo
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)|::1|:3000... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 200 OK
  Date: Tue, 24 Nov 2020 15:29:56 GMT
  Content-Security-Policy: default-src 'self'; connect-src 'self' www.google-analytics.com; img-src 'self' data: www.google-analytics.com www.googletagmanager.com; script-src 'self' 'unsafe-inline' cdnjs.cloudflare.com code.jquery.com stackpath.bootstrapcdn.com www.google-analytics.com www.googletagmanager.com; style-src 'self' 'unsafe-inline' cdnjs.cloudflare.com stackpath.bootstrapcdn.com; report-uri /csp-reports
  Content-Type: text/html;charset=UTF-8
  Server: Mojolicious (Perl)
  Content-Length: 20150
Length: 20150 (20K) [text/html]
Saving to: “search_foo.html”

search_foo.html          100%[==================================>]  19,68K  --.-KB/s    in 0s      

2020-11-24 15:29:56 (137 MB/s) - “search_foo.html” saved [20150/20150]

But in the saved file search_foo.html I found that the results were not complete: the search could not find any Perl functions for this term. So I looked into the database and found that the functions table was not populated. This documentation is distributed in a separate package, perl-doc, in Debian. Adding this package and reindexing the database now produces the expected results for the request http://localhost:3000/5.28.1/perl, as shown below.
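
A sketch of the commands that correspond to this fix inside the container (assuming root inside a Debian-based container):

# add the missing core documentation package and rebuild the index
apt-get update && apt-get install -y perl-doc
./perldoc-browser.pl index all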

$ wget -S -O perl_index.html --max-redirect=0 "http://localhost:3000/5.28.1/perl"
--2020-11-24 16:32:52--  http://localhost:3000/5.28.1/perl
Resolving localhost (localhost)... ::1, 127.0.0.1
Connecting to localhost (localhost)|::1|:3000... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 200 OK
  Content-Security-Policy: default-src 'self'; connect-src 'self' www.google-analytics.com; img-src 'self' data: www.google-analytics.com www.googletagmanager.com; script-src 'self' 'unsafe-inline' cdnjs.cloudflare.com code.jquery.com stackpath.bootstrapcdn.com www.google-analytics.com www.googletagmanager.com; style-src 'self' 'unsafe-inline' cdnjs.cloudflare.com stackpath.bootstrapcdn.com; report-uri /csp-reports
  Server: Mojolicious (Perl)
  Content-Length: 37073
  Content-Type: text/html;charset=UTF-8
  Date: Tue, 24 Nov 2020 16:32:52 GMT
Length: 37073 (36K) [text/html]
Saving to: “perl_index.html”

perl_index.html          100%[==================================>]  36,20K  --.-KB/s    in 0s      

2020-11-24 16:32:52 (237 MB/s) - “perl_index.html” saved [37073/37073]

which appears like this in the web service log:

[2020-11-24 16:32:52.36986] [22] [debug] [b0G-OsON] GET "/5.28.1/perl"
[2020-11-24 16:32:52.37060] [22] [debug] [b0G-OsON] Routing to a callback
[2020-11-24 16:32:52.58280] [22] [debug] [b0G-OsON] Rendering cached template "perldoc.html.ep"
[2020-11-24 16:32:52.58383] [22] [debug] [b0G-OsON] Rendering cached template "menubar.html.ep"
[2020-11-24 16:32:52.58968] [22] [debug] [b0G-OsON] 200 OK (0.219804s, 4.550/s)

bodo-hugo-barwich commented 3 years ago

Having concluded the first sprint, I will work on the next sprint, which will be the integration with the PostgreSQL service in a separate container. I looked at the PostgreSQL image in the Docker repository (Official Docker Image Dockerfile), but I found that it compiles PostgreSQL from source. I prefer the cleaner approach of installing it from the mainstream repositories, as shown in the Docker documentation article: Deploy Postgres on Docker.

bodo-hugo-barwich commented 3 years ago

Thinking of a possible use case of this development, such as creating a serverless testing environment for automated tests as described in: GitHub Action Service Containers, the setup becomes easier using prebuilt, publicly available images. The Postgres image for Alpine Linux can be easily configured as described in: Postgres Alpine Image with docker-compose. The image documentation describes (container configuration with Environment Variables) that the environment variables POSTGRES_USER and POSTGRES_PASSWORD are used to set up the database. Configured this way, a user account perldoc is created with a matching database perldoc.
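
As a stand-alone illustration of this configuration, the database container could be started with something like the following (image tag and password are examples; in the cluster this is done via docker-compose):

docker run -d --name perldoc_db \
  -e POSTGRES_USER=perldoc \
  -e POSTGRES_PASSWORD=secret \
  -p 5432:5432 \
  postgres:alpine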

The data directory was assigned as the persistent database storage location. Unfortunately this has the downside that the common developer user account can't build the web site Docker image, since the data directory will be owned by the postgres user account:

$ docker-compose up --build web
Creating network "perldoc_web_default" with the default driver
Building web
Traceback (most recent call last):
  File "/usr/bin/docker-compose", line 11, in <module>
    load_entry_point('docker-compose==1.21.0', 'console_scripts', 'docker-compose')()
  File "/usr/lib/python3/dist-packages/compose/cli/main.py", line 71, in main
    command()
  File "/usr/lib/python3/dist-packages/compose/cli/main.py", line 127, in perform_command
    handler(command, command_options)
  File "/usr/lib/python3/dist-packages/compose/cli/main.py", line 1052, in up
    to_attach = up(False)
  File "/usr/lib/python3/dist-packages/compose/cli/main.py", line 1048, in up
    silent=options.get('--quiet-pull'),
  File "/usr/lib/python3/dist-packages/compose/project.py", line 466, in up
    svc.ensure_image_exists(do_build=do_build, silent=silent)
  File "/usr/lib/python3/dist-packages/compose/service.py", line 314, in ensure_image_exists
    self.build()
  File "/usr/lib/python3/dist-packages/compose/service.py", line 1027, in build
    platform=platform,
  File "/usr/lib/python3/dist-packages/docker/api/build.py", line 154, in build
    path, exclude=exclude, dockerfile=dockerfile, gzip=gzip
  File "/usr/lib/python3/dist-packages/docker/utils/build.py", line 30, in tar
    files=sorted(exclude_paths(root, exclude, dockerfile=dockerfile[0])),
  File "/usr/lib/python3/dist-packages/docker/utils/build.py", line 49, in exclude_paths
    return set(pm.walk(root))
  File "/usr/lib/python3/dist-packages/docker/utils/build.py", line 214, in rec_walk
    for sub in rec_walk(cur):
  File "/usr/lib/python3/dist-packages/docker/utils/build.py", line 184, in rec_walk
    for f in os.listdir(current_dir):
PermissionError: [Errno 13] Permission denied: '/absolute/path/to/project/data'

Still, the web image can be built as the root user on the Docker host. But to resolve this limitation a custom database image would be needed that changes the user ID of the postgres user to match the development user account ID.

As documented in the official Docker documentation (excluding files with .dockerignore), a .dockerignore file can avoid this conflict and also speed up the build by excluding the extensive .git directory from the build context.
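
A minimal .dockerignore sketch; the exact entries used in the project are an assumption:

# keep the postgres data directory and the large .git directory out of the build context
cat > .dockerignore <<'EOF'
.git
data
EOF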

Now common users with access to the docker service can build the image.

bodo-hugo-barwich commented 3 years ago

Now the Docker Cluster consists of 2 Containers:

$ docker-compose up -d db
Creating network "perldoc_web_default" with the default driver
Creating perldoc_db ... done

$ docker-compose ps
   Name                 Command              State           Ports         
---------------------------------------------------------------------------
perldoc_db   docker-entrypoint.sh postgres   Up      0.0.0.0:5432->5432/tcp

$ docker-compose up -d web
Creating perldoc_web ... done

$ docker-compose ps
   Name                  Command               State           Ports         
-----------------------------------------------------------------------------
perldoc_db    docker-entrypoint.sh postgres    Up      0.0.0.0:5432->5432/tcp
perldoc_web   entrypoint.sh perldoc-brow ...   Up      0.0.0.0:3000->3000/tcp

$ docker-compose down
Stopping perldoc_web ... done
Stopping perldoc_db  ... done
Removing perldoc_web ... done
Removing perldoc_db  ... done
Removing network perldoc_web_default

To access the database container, the configuration in perldoc-browser.conf needs to be changed to:

{
  # [...]
  pg => 'postgresql://perldoc:secret@db/perldoc',
  # [...]
  search_backend => 'pg',
  # [...]
}

Within the Docker cluster the container perldoc_db is known as just db. The docker-compose configuration also exposes port 5432, so the database can also be accessed from the Docker host at localhost:5432.
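
The exposed port can be verified from the Docker host with any PostgreSQL client, for example (assuming psql is installed on the host and using the credentials from the configuration above):

$ psql postgresql://perldoc:secret@localhost:5432/perldoc -c '\conninfo'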

But to use this configuration, perldoc-browser.pl must be run with docker-compose. The Perl module installation is done as a local, user-only installation, as is common in serverless environments. So it is convenient to use the entrypoint.sh script to launch the database initialization:

$ docker-compose up -d
Starting perldoc_web ... done
Starting perldoc_db  ... done

$ docker-compose ps
   Name                  Command               State           Ports         
-----------------------------------------------------------------------------
perldoc_db    docker-entrypoint.sh postgres    Up      0.0.0.0:5432->5432/tcp
perldoc_web   entrypoint.sh perldoc-brow ...   Up      0.0.0.0:3000->3000/tcp

$ docker-compose exec web entrypoint.sh perldoc-browser.pl index all
Container 'b2046f460754.perldoc_web': 'entrypoint.sh' go ...
Container 'b2046f460754.perldoc_web' - Network:
127.0.0.1^Ilocalhost$ ::1^Ilocalhost ip6-localhost ip6-loopback$ fe00::0^Iip6-localnet$ ff00::0^Iip6-mcastprefix$ ff02::1^Iip6-allnodes$ ff02::2^Iip6-allrouters$ 172.20.0.2^Ib2046f460754$
Command: 'perldoc-browser.pl index all'
Configuring Local Installation ...
PATH="/home/perldoc-browser/perl5/bin${PATH:+:${PATH}}"; export PATH;
PERL5LIB="/home/perldoc-browser/perl5/lib/perl5${PERL5LIB:+:${PERL5LIB}}"; export PERL5LIB;
PERL_LOCAL_LIB_ROOT="/home/perldoc-browser/perl5${PERL_LOCAL_LIB_ROOT:+:${PERL_LOCAL_LIB_ROOT}}"; export PERL_LOCAL_LIB_ROOT;
PERL_MB_OPT="--install_base \"/home/perldoc-browser/perl5\""; export PERL_MB_OPT;
PERL_MM_OPT="INSTALL_BASE=/home/perldoc-browser/perl5"; export PERL_MM_OPT;
Mojolicious Version: 9.21 [Code: '0']
Search Backend: pg
Mojo::Pg Version: 4.25 [Code: '0']
Service 'perldoc-browser.pl': Launching ...
# [...]

Here the important outputs are:

Search Backend: pg
Mojo::Pg Version: 4.25 [Code: '0']

that indicate that the PostgreSQL backend is recognized and the Perl modules are installed.

Now the web site cannot run without the database container, because the database host db cannot be resolved otherwise:

$ docker-compose up web
# [...]
perldoc_web | Search Backend: pg
perldoc_web | Mojo::Pg Version: 4.25 [Code: '0']
perldoc_web | Service 'perldoc-browser.pl': Launching ...
perldoc_web | DBI connect('dbname=perldoc;host=db','perldoc',...) failed: could not translate host name "db" to address: Name or service not known at /home/perldoc-browser/perl5/lib/perl5/Mojo/Pg.pm line 73.
perldoc_web exited with code 255

bodo-hugo-barwich commented 3 years ago

Now the last sprint involves introducing the ElasticSearch container in the cluster. The cpanfile documents that the ElasticSearch client is for the 6.X series:

feature 'es', 'Elasticsearch search backend', sub {
  requires 'Search::Elasticsearch' => '6.00';
  requires 'Search::Elasticsearch::Client::6_0';
  requires 'Log::Any::Adapter::MojoLog';
};

So the latest version of this series is 6.8, which corresponds to the blacktop/elasticsearch:6.8 image. Within the cluster the ElasticSearch component will be known as elasticsearch, which must be configured in the perldoc-browser.conf configuration file. The ElasticSearch data directory will be located in data/es/ and the PostgreSQL data directory will be moved to data/pg/. Initially those directories will be provided by the git repository, but to make the PostgreSQL directory accessible to the PostgreSQL container, its data directory needs to be changed to be owned by the postgres (uid 70) user:

# chown 70:70 data/pg
# cd data
# pwd
/absolute/path/to/project/data
# chmod a+rx pg
# ls -lah pg
total 60K
drwxr-xr-x 19   70   70 4,0K nov 13 11:34 .
drwxr-xr-x  4 bodo bodo   60 nov 13 11:22 ..
drwxr-xr-x  6   70   70   54 ago 21 19:18 base
drwxr-xr-x  2   70   70 4,0K nov 13 11:35 global
-rw-r--r--  1 bodo bodo    0 nov 11 19:29 .keep
drwxr-xr-x  2   70   70    6 ago 21 19:18 pg_commit_ts
drwxr-xr-x  2   70   70    6 ago 21 19:18 pg_dynshmem
-rw-r--r--  1   70   70 4,5K ago 21 19:18 pg_hba.conf
-rw-r--r--  1   70   70 1,6K ago 21 19:18 pg_ident.conf
drwxr-xr-x  4   70   70   68 nov 13 11:39 pg_logical
drwxr-xr-x  4   70   70   36 ago 21 19:18 pg_multixact
drwxr-xr-x  2   70   70   18 nov 13 11:34 pg_notify
drwxr-xr-x  2   70   70    6 ago 21 19:18 pg_replslot
drwxr-xr-x  2   70   70    6 ago 21 19:18 pg_serial
drwxr-xr-x  2   70   70    6 ago 21 19:18 pg_snapshots
drwxr-xr-x  2   70   70    6 nov 13 11:34 pg_stat
drwxr-xr-x  2   70   70   63 nov 13 11:56 pg_stat_tmp
drwxr-xr-x  2   70   70   18 ago 21 19:18 pg_subtrans
drwxr-xr-x  2   70   70    6 ago 21 19:18 pg_tblspc
drwxr-xr-x  2   70   70    6 ago 21 19:18 pg_twophase
-rw-r--r--  1   70   70    3 ago 21 19:18 PG_VERSION
drwxr-xr-x  3   70   70  220 ago 22 12:57 pg_wal
drwxr-xr-x  2   70   70   18 ago 21 19:18 pg_xact
-rw-r--r--  1   70   70   88 ago 21 19:18 postgresql.auto.conf
-rw-r--r--  1   70   70  24K ago 21 19:18 postgresql.conf
-rw-r--r--  1   70   70   24 nov 13 11:34 postmaster.opts
-rw-------  1   70   70   94 nov 13 11:34 postmaster.pid

In order to commit the file data/pg/.keep, the data/pg/ directory needed to be made accessible again after each Postgres launch, because Postgres restricts the permissions on the directory on every launch.

On a fresh install, if the Postgres data directory is not changed to be accessible to the postgres user, the database will fail to start up, and the container will show as failed in the docker-compose status report:

$ docker-compose ps
   Name                  Command               State                 Ports              
----------------------------------------------------------------------------------------
perldoc_db    docker-entrypoint.sh postgres    Exit 1                                   
perldoc_es    /elastic-entrypoint.sh ela ...   Up       0.0.0.0:9200->9200/tcp, 9300/tcp
perldoc_web   entrypoint.sh perldoc-brow ...   Up       0.0.0.0:3000->3000/tcp  
$ docker-compose up db
Creating network "perldoc_web_default" with the default driver
Creating perldoc_db ... done
Attaching to perldoc_db
perldoc_db       | chmod: /var/lib/postgresql/data: Operation not permitted
perldoc_db       | chmod: /var/run/postgresql: Operation not permitted
perldoc_db       | initdb: could not look up effective user ID 1000: user does not exist
perldoc_db exited with code 1

On the initial install the ElasticSearch container is unable to start up because of the default kernel limit on virtual memory map areas:

$ docker-compose ps
   Name                  Command                State            Ports         
-------------------------------------------------------------------------------
perldoc_db    docker-entrypoint.sh postgres    Exit 1                          
perldoc_es    /elastic-entrypoint.sh ela ...   Exit 78                         
perldoc_web   entrypoint.sh perldoc-brow ...   Up        0.0.0.0:3000->3000/tcp
$ docker logs perldoc_es
warning: Falling back to java on path. This behavior is deprecated. Specify JAVA_HOME
[2021-11-13T09:49:48,221][WARN ][o.e.c.l.LogConfigurator  ] [unknown] Some logging configurations have %marker but don't have %node_name. We will automatically add %node_name to the pattern to ease the migration for users who customize log4j2.properties but will stop this behavior in 7.0. You should manually replace `%node_name` with `[%node_name]%marker ` in these locations:
  /usr/share/elasticsearch/config/log4j2.properties
[2021-11-13T09:49:50,449][INFO ][o.e.e.NodeEnvironment    ] [Evou766] using [1] data paths, mounts [[/usr/share/elasticsearch/data (/dev/mapper/vg_laptop--bodo-lv_home)]], net usable_space [175.4gb], net total_space [186.1gb], types [xfs]
[2021-11-13T09:49:50,450][INFO ][o.e.e.NodeEnvironment    ] [Evou766] heap size [1007.3mb], compressed ordinary object pointers [true]
[2021-11-13T09:49:50,453][INFO ][o.e.n.Node               ] [Evou766] node name derived from node ID [Evou766tQwqgAOXMj5emuw]; set [node.name] to override
[2021-11-13T09:49:50,454][INFO ][o.e.n.Node               ] [Evou766] version[6.8.13], pid[1], build[oss/tar/be13c69/2020-10-16T09:09:46.555371Z], OS[Linux/4.19.0-18-amd64/amd64], JVM[IcedTea/OpenJDK 64-Bit Server VM/1.8.0_242/25.242-b08]
[2021-11-13T09:49:50,454][INFO ][o.e.n.Node               ] [Evou766] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/usr/share/elasticsearch/tmp, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -Xloggc:logs/gc.log, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=32, -XX:GCLogFileSize=64m, -Des.cgroups.hierarchy.override=/, -Des.path.home=/usr/share/elasticsearch, -Des.path.conf=/usr/share/elasticsearch/config, -Des.distribution.flavor=oss, -Des.distribution.type=tar]
[2021-11-13T09:49:56,930][INFO ][o.e.p.PluginsService     ] [Evou766] loaded module [aggs-matrix-stats]
[2021-11-13T09:49:56,930][INFO ][o.e.p.PluginsService     ] [Evou766] loaded module [analysis-common]
[2021-11-13T09:49:56,931][INFO ][o.e.p.PluginsService     ] [Evou766] loaded module [ingest-common]
[2021-11-13T09:49:56,931][INFO ][o.e.p.PluginsService     ] [Evou766] loaded module [ingest-geoip]
[2021-11-13T09:49:56,931][INFO ][o.e.p.PluginsService     ] [Evou766] loaded module [ingest-user-agent]
[2021-11-13T09:49:56,931][INFO ][o.e.p.PluginsService     ] [Evou766] loaded module [lang-expression]
[2021-11-13T09:49:56,931][INFO ][o.e.p.PluginsService     ] [Evou766] loaded module [lang-mustache]
[2021-11-13T09:49:56,931][INFO ][o.e.p.PluginsService     ] [Evou766] loaded module [lang-painless]
[2021-11-13T09:49:56,931][INFO ][o.e.p.PluginsService     ] [Evou766] loaded module [mapper-extras]
[2021-11-13T09:49:56,931][INFO ][o.e.p.PluginsService     ] [Evou766] loaded module [parent-join]
[2021-11-13T09:49:56,931][INFO ][o.e.p.PluginsService     ] [Evou766] loaded module [percolator]
[2021-11-13T09:49:56,931][INFO ][o.e.p.PluginsService     ] [Evou766] loaded module [rank-eval]
[2021-11-13T09:49:56,932][INFO ][o.e.p.PluginsService     ] [Evou766] loaded module [reindex]
[2021-11-13T09:49:56,932][INFO ][o.e.p.PluginsService     ] [Evou766] loaded module [repository-url]
[2021-11-13T09:49:56,932][INFO ][o.e.p.PluginsService     ] [Evou766] loaded module [transport-netty4]
[2021-11-13T09:49:56,932][INFO ][o.e.p.PluginsService     ] [Evou766] loaded module [tribe]
[2021-11-13T09:49:56,933][INFO ][o.e.p.PluginsService     ] [Evou766] no plugins loaded
[2021-11-13T09:50:13,538][INFO ][o.e.d.DiscoveryModule    ] [Evou766] using discovery type [zen] and host providers [settings]
[2021-11-13T09:50:15,808][INFO ][o.e.n.Node               ] [Evou766] initialized
[2021-11-13T09:50:15,809][INFO ][o.e.n.Node               ] [Evou766] starting ...
[2021-11-13T09:50:16,474][INFO ][o.e.t.TransportService   ] [Evou766] publish_address {172.19.0.4:9300}, bound_addresses {0.0.0.0:9300}
[2021-11-13T09:50:16,526][INFO ][o.e.b.BootstrapChecks    ] [Evou766] bound or publishing to a non-loopback address, enforcing bootstrap checks
ERROR: [1] bootstrap checks failed
[1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[2021-11-13T09:50:16,550][INFO ][o.e.n.Node               ] [Evou766] stopping ...
[2021-11-13T09:50:16,634][INFO ][o.e.n.Node               ] [Evou766] stopped
[2021-11-13T09:50:16,634][INFO ][o.e.n.Node               ] [Evou766] closing ...
[2021-11-13T09:50:16,678][INFO ][o.e.n.Node               ] [Evou766] closed

So this limit must be increased as the root user:

# sysctl -w vm.max_map_count=262144
vm.max_map_count = 262144

There is good documentation about the effects this configuration has on the Docker host system, published by the SUSE distribution: Documentation on vm.max_map_count.
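
To make the setting survive a reboot of the Docker host it can also be persisted, for example with a sysctl.d drop-in (the file name is arbitrary):

# run as root: persist the setting across reboots
echo "vm.max_map_count=262144" > /etc/sysctl.d/99-elasticsearch.conf
sysctl --system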

To use the ElasticSearch search backend, it must be configured in the perldoc-browser.conf configuration file:

  es => 'http://elasticsearch:9200',
  search_backend => 'es',

Before the search backend is usable the indexation script must be run with:

$ docker-compose exec web entrypoint.sh perldoc-browser.pl index all
Container 'db3095cd2a39.perldoc_web': 'entrypoint.sh' go ...
Container 'db3095cd2a39.perldoc_web' - Network:
127.0.0.1^Ilocalhost$ ::1^Ilocalhost ip6-localhost ip6-loopback$ fe00::0^Iip6-localnet$ ff00::0^Iip6-mcastprefix$ ff02::1^Iip6-allnodes$ ff02::2^Iip6-allrouters$ 172.19.0.4^Idb3095cd2a39$
Command: 'perldoc-browser.pl index all'
Configuring Local Installation ...
PATH="/home/perldoc-browser/perl5/bin${PATH:+:${PATH}}"; export PATH;
PERL5LIB="/home/perldoc-browser/perl5/lib/perl5${PERL5LIB:+:${PERL5LIB}}"; export PERL5LIB;
PERL_LOCAL_LIB_ROOT="/home/perldoc-browser/perl5${PERL_LOCAL_LIB_ROOT:+:${PERL_LOCAL_LIB_ROOT}}"; export PERL_LOCAL_LIB_ROOT;
PERL_MB_OPT="--install_base \"/home/perldoc-browser/perl5\""; export PERL_MB_OPT;
PERL_MM_OPT="INSTALL_BASE=/home/perldoc-browser/perl5"; export PERL_MM_OPT;
Mojolicious Version: 9.21 [Code: '0']
Search Backend: es
Search::Elasticsearch Version: 7.715 [Code: '0']
Service 'perldoc-browser.pl': Launching ...
[2021-11-13 10:11:18.25728] [26] [info] Current cxns: ["http://elasticsearch:9200"]
[2021-11-13 10:11:18.25758] [26] [info] Forcing ping before next use on all live cxns
[2021-11-13 10:11:18.25780] [26] [info] Ping [http://elasticsearch:9200] before next request
[2021-11-13 10:11:30.97793] [26] [info] Pinging [http://elasticsearch:9200]
[2021-11-13 10:11:31.29653] [26] [info] Marking [http://elasticsearch:9200] as live
# [...]
Swapping faqs_5.28.1 index(es)  => faqs_5.28.1_1636798290
Swapping pods_5.28.1 index(es)  => pods_5.28.1_1636798290
Swapping functions_5.28.1 index(es)  => functions_5.28.1_1636798290
Swapping perldeltas_5.28.1 index(es)  => perldeltas_5.28.1_1636798290
Swapping variables_5.28.1 index(es)  => variables_5.28.1_1636798290

After the indexation the ElasticSearch component has 5 indices for each Perl version:

$ curl -v http://localhost:9200/_cat/indices
# [...]
*   Trying ::1...
* TCP_NODELAY set
* Expire in 149997 ms for 3 (transfer 0x55f60a399e20)
* Expire in 200 ms for 4 (transfer 0x55f60a399e20)
* Connected to localhost (::1) port 9200 (#0)
> GET /_cat/indices HTTP/1.1
> Host: localhost:9200
> User-Agent: curl/7.64.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< content-type: text/plain; charset=UTF-8
< content-length: 455
< 
yellow open perldeltas_5.28.1_1636798290 BTS4QdaeQk6OJLFnyYUI9g 1 1 2164 0   3.2mb   3.2mb
yellow open faqs_5.28.1_1636798290       gyrqSq7mQrKXzAmQJ4cGVA 1 1  305 0 784.9kb 784.9kb
yellow open variables_5.28.1_1636798290  wjDlOrQrRaWb77HTKhdA5Q 1 1  150 0  17.1kb  17.1kb
yellow open pods_5.28.1_1636798290       PJ-EZ0IbQb67EOzkGrVj1w 1 1 1579 0  23.2mb  23.2mb
yellow open functions_5.28.1_1636798290  xzukrTriSNWiyPqKMpZU4w 1 1  292 0 570.5kb 570.5kb
* Connection #0 to host localhost left intact

and the "foo" search will produce a complete search result:

$ curl -v -o search_foo.html "http://localhost:3000/5.28.1/search?q=foo"
# [...]
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* Expire in 0 ms for 1 (transfer 0x55a2ab834e20)
# [...]
*   Trying ::1...
* TCP_NODELAY set
* Expire in 149998 ms for 3 (transfer 0x55a2ab834e20)
* Expire in 200 ms for 4 (transfer 0x55a2ab834e20)
* Connected to localhost (::1) port 3000 (#0)
> GET /5.28.1/search?q=foo HTTP/1.1
> Host: localhost:3000
> User-Agent: curl/7.64.0
> Accept: */*
> 
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0< HTTP/1.1 200 OK
< Date: Sat, 13 Nov 2021 11:04:38 GMT
< Content-Length: 52296
< Content-Security-Policy: default-src 'self'; connect-src 'self' www.google-analytics.com; img-src 'self' data: www.google-analytics.com www.googletagmanager.com; script-src 'self' 'unsafe-inline' cdnjs.cloudflare.com code.jquery.com stackpath.bootstrapcdn.com www.google-analytics.com www.googletagmanager.com; style-src 'self' 'unsafe-inline' cdnjs.cloudflare.com stackpath.bootstrapcdn.com; report-uri /csp-reports
< Content-Type: text/html;charset=UTF-8
< Server: Mojolicious (Perl)
< 
{ [52296 bytes data]
100 52296  100 52296    0     0  36596      0  0:00:01  0:00:01 --:--:-- 36596
* Connection #0 to host localhost left intact

bodo-hugo-barwich commented 3 years ago

Before running the indexation it is important to check whether the ElasticSearch component is already ready for service with:

$ curl -v http://localhost:9200
# [...]
*   Trying ::1...
* TCP_NODELAY set
* Expire in 150000 ms for 3 (transfer 0x56117c89ae20)
* Expire in 200 ms for 4 (transfer 0x56117c89ae20)
* Connected to localhost (::1) port 9200 (#0)
> GET / HTTP/1.1
> Host: localhost:9200
> User-Agent: curl/7.64.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< content-type: application/json; charset=UTF-8
< content-length: 490
< 
{
  "name" : "Evou766",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "D7p_jR1TQBeK7J69Hk3QRg",
  "version" : {
    "number" : "6.8.13",
    "build_flavor" : "oss",
    "build_type" : "tar",
    "build_hash" : "be13c69",
    "build_date" : "2020-10-16T09:09:46.555371Z",
    "build_snapshot" : false,
    "lucene_version" : "7.7.3",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}
* Connection #0 to host localhost left intact

otherwise the Search::Elasticsearch::Role::Cxn Role will produce a NoNodes Exception and the indexation will fail:

$ docker-compose up -d
Creating network "perldoc_web_default" with the default driver
Creating perldoc_es  ... done
Creating perldoc_db  ... done
Creating perldoc_web ... done
$ docker-compose exec web entrypoint.sh perldoc-browser.pl index all
Container 'de674593309b.perldoc_web': 'entrypoint.sh' go ...
Container 'de674593309b.perldoc_web' - Network:
127.0.0.1^Ilocalhost$ ::1^Ilocalhost ip6-localhost ip6-loopback$ fe00::0^Iip6-localnet$ ff00::0^Iip6-mcastprefix$ ff02::1^Iip6-allnodes$ ff02::2^Iip6-allrouters$ 172.25.0.3^Ide674593309b$
Command: 'perldoc-browser.pl index all'
Configuring Local Installation ...
PATH="/home/perldoc-browser/perl5/bin${PATH:+:${PATH}}"; export PATH;
PERL5LIB="/home/perldoc-browser/perl5/lib/perl5${PERL5LIB:+:${PERL5LIB}}"; export PERL5LIB;
PERL_LOCAL_LIB_ROOT="/home/perldoc-browser/perl5${PERL_LOCAL_LIB_ROOT:+:${PERL_LOCAL_LIB_ROOT}}"; export PERL_LOCAL_LIB_ROOT;
PERL_MB_OPT="--install_base \"/home/perldoc-browser/perl5\""; export PERL_MB_OPT;
PERL_MM_OPT="INSTALL_BASE=/home/perldoc-browser/perl5"; export PERL_MM_OPT;

Mojolicious Version: 9.21 [Code: '0']
Search Backend: es
Search::Elasticsearch Version: 7.715 [Code: '0']
Service 'perldoc-browser.pl': Launching ...
[2021-11-13 12:52:59.01476] [12] [info] Current cxns: ["http://elasticsearch:9200"]
[2021-11-13 12:52:59.01538] [12] [info] Forcing ping before next use on all live cxns
[2021-11-13 12:52:59.01664] [12] [info] Ping [http://elasticsearch:9200] before next request
[2021-11-13 12:53:41.98002] [12] [info] Pinging [http://elasticsearch:9200]
[2021-11-13 12:53:44.37398] [12] [debug] [Cxn] ** [http://elasticsearch:9200]-[599] Could not connect to 'elasticsearch:9200': Connection refused, called from sub Search::Elasticsearch::Role::Client::Direct::__ANON__ at /home/perldoc-browser/lib/PerldocBrowser/Plugin/PerldocSearch/Elastic.pm line 352. With vars: {'status_code' => 599,'request' => {'method' => 'HEAD','timeout' => 2,'path' => '/'}}

[2021-11-13 12:53:44.37455] [12] [info] Marking [http://elasticsearch:9200] as dead. Next ping at: Sat Nov 13 12:54:44 2021
[2021-11-13 12:53:44.39780] [12] [fatal] [NoNodes] ** No nodes are available: [http://elasticsearch:9200], called from sub Search::Elasticsearch::Role::Client::Direct::__ANON__ at /home/perldoc-browser/lib/PerldocBrowser/Plugin/PerldocSearch/Elastic.pm line 352.
[NoNodes] ** No nodes are available: [http://elasticsearch:9200], called from sub Search::Elasticsearch::Role::Client::Direct::__ANON__ at /home/perldoc-browser/lib/PerldocBrowser/Plugin/PerldocSearch

This is because ElasticSearch is known to be slow to start up, which is also documented at: ElasticSearch slow Startup produces Index Corruption.
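
A simple way to guard against this is to poll the ElasticSearch port until it answers before starting the indexation; a sketch (retry count and sleep interval are arbitrary):

# wait until ElasticSearch responds, then run the indexation
for i in $(seq 1 30); do
  curl -sf http://localhost:9200 >/dev/null && break
  sleep 5
done
docker-compose exec web entrypoint.sh perldoc-browser.pl index all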

bodo-hugo-barwich commented 2 years ago

Reproducing the build on a new system, it turns out that it is not the ownership of the PostgreSQL data directory that is the obstacle, but the committed file data/pg/.keep inside of it:

perldoc_db       | initdb: directory "/var/lib/postgresql/data" exists but is not empty
perldoc_db       | It contains a dot-prefixed/invisible file, perhaps due to it being a mount point.
perldoc_db exited with code 1

So the data/pg/.keep file must be removed to make the database installation work.

The ownership of the data directory will be changed automatically during the installation.
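
So on a fresh checkout the start-up sequence is simply (as described above):

# clear the mount point before the first start-up so initdb accepts the directory
rm data/pg/.keep
docker-compose up -d db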