Closed antbrown closed 2 years ago
Couple of quick notes, I tried with 1.0.0-alpha-14 and it seems to be working.
I also noticed on https://hub.docker.com/r/islandora/solr/tags there are 1.0.x releases, I tried 1.0.5 but it also didn't work, with the same error about the logs directory not being writeable.
Are these 1.0.x releases meant to be used as the TAG in our .env now? i.e. https://github.com/Islandora-Devops/isle-dc/blob/development/sample.env#L68
@nigelgbanks do you think we should mkdir and/or chown/chmod that path in here? https://github.com/Islandora-Devops/isle-buildkit/blob/main/solr/Dockerfile
I'll admit, we haven't hit this issue at BD.
The directory doesn't exist until the solr program starts. It is created by the solr program, which runs as the solr user, and all the folders under /opt/solr are owned by the solr user. So on its own the solr service is fine.
The error:
ERROR: Logs directory /opt/solr/server/logs is not writable. Exiting
must come from sharing a directory across the two services, drupal and solr. I believe this is done to export the content from the search_api_solr module for solr to use.
One potential solution might be to use additional properties on the volume to show which service it actually belongs to. The following might work if set on the drupal service.
- type: volume
  source: solr-data
  target: /opt/solr/server/solr
  volume:
    nocopy: true
There may be other settings worth looking into that would help. I believe this is a race-condition kind of bug that would only come up if the drupal service was ready and wrote the solr configuration before solr started up and created the logs directory.
Hi, thanks for getting back to me.
@nigelgbanks that directory looks different to the one I'm having issues with, i.e. /opt/solr/server/solr vs /opt/solr/server/logs
Can you confirm whether you see the same error as me when trying to start solr by itself, using the 1.0.0-alpha-15 tag?
docker run --rm -it islandora/solr:1.0.0-alpha-15
versus, a working copy (for me):
docker run --rm -it islandora/solr:1.0.0-alpha-14
It looks like there were a bunch of changes made to logging here: https://github.com/Islandora-Devops/isle-buildkit/commit/7ecc48421b65f3bb88fe15ce1e66f0fa5cb4f2af, they look related but I don't understand the log4j config enough to know exactly what is going on.
@nigelgbanks that directory looks different to the one I'm having issues with, ie /opt/solr/server/solr vs /opt/solr/server/logs
Ah yes, you're right, that's what I get for trying to respond during a meeting :D
Both of those versions work for me when run on their own, which makes me expect this is something to do with how isle-dc is set up.
Though if docker run --rm -it islandora/solr:1.0.0-alpha-15 does not work on its own for you, that is very, very strange.
Does what you have locally match the digest for Docker Hub? (unfortunately you can't check with the UI)
Check your local image:
docker image inspect islandora/solr:1.0.0-alpha-15 --format '{{json .RepoDigests}}' | jq .
# Yields
[
"islandora/solr@sha256:8cd1564698f54acc6ee76feadc3e2d22d9b9a6f851609f30d3f0bf531f906c0f"
]
Digest from docker hub:
repo="islandora/solr"
tag="1.0.0-alpha-15"
acceptM="application/vnd.docker.distribution.manifest.v2+json"
acceptML="application/vnd.docker.distribution.manifest.list.v2+json"
token=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:${repo}:pull" \
| jq -r '.token')
curl -H "Accept: ${acceptM}" \
-H "Accept: ${acceptML}" \
-H "Authorization: Bearer $token" \
-I -s "https://registry-1.docker.io/v2/${repo}/manifests/${tag}" | grep etag
# Yields
etag: "sha256:8cd1564698f54acc6ee76feadc3e2d22d9b9a6f851609f30d3f0bf531f906c0f"
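The two values above come back in slightly different shapes (a RepoDigests entry versus an etag header), so here is a rough sketch of how they could be compared in a script rather than by eye. The helper names normalize_digest and digests_match are made up for illustration, and remote_etag in the usage comment is a hypothetical variable holding the etag line from the curl call.

```shell
#!/bin/sh
# Hypothetical helper: reduce either digest form to the bare "sha256:<hex>" part.
normalize_digest() {
  # Strip double quotes and carriage returns, then any "etag: " prefix
  # and any "repo@" prefix.
  printf '%s' "$1" | tr -d '"\r' \
    | sed -e 's/^[Ee][Tt]ag:[[:space:]]*//' -e 's/^.*@//'
}

# Hypothetical helper: succeed if both arguments name the same digest.
digests_match() {
  [ "$(normalize_digest "$1")" = "$(normalize_digest "$2")" ]
}

# Example (needs docker plus the token/curl steps shown above):
#   local_digest=$(docker image inspect islandora/solr:1.0.0-alpha-15 \
#     --format '{{index .RepoDigests 0}}')
#   digests_match "$local_digest" "$remote_etag" && echo "same image"
```

This only compares strings; it does not pull or verify anything itself.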
Also in both cases the logs directory is created by solr when starting.
docker run --rm -it --entrypoint ls islandora/solr:1.0.0-alpha-14 -lah /opt/solr/server
total 188K
drwxr-xr-x 10 solr solr 4.0K Mar 3 15:33 .
drwxr-xr-x 6 solr solr 4.0K Mar 3 15:33 ..
-rw-r--r-- 1 solr solr 3.9K Oct 13 2017 README.txt
drwxr-xr-x 2 solr solr 4.0K Mar 3 15:33 contexts
drwxr-xr-x 2 solr solr 4.0K Mar 3 15:33 etc
drwxr-xr-x 3 solr solr 4.0K Mar 3 15:33 lib
drwxr-xr-x 2 solr solr 4.0K Mar 3 15:33 modules
drwxr-xr-x 2 solr solr 4.0K Mar 3 15:33 resources
drwxr-xr-x 3 solr solr 4.0K Mar 3 15:33 scripts
drwxr-xr-x 2 solr solr 4.0K Mar 3 15:33 solr
drwxr-xr-x 3 solr solr 4.0K Mar 3 15:33 solr-webapp
-rw-r--r-- 1 solr solr 142.4K May 31 2017 start.jar
docker run --rm -it --entrypoint ls islandora/solr:1.0.0-alpha-15 -lah /opt/solr/server
total 188K
drwxr-xr-x 10 solr solr 4.0K Mar 24 11:39 .
drwxr-xr-x 6 solr solr 4.0K Mar 24 11:40 ..
-rw-r--r-- 1 solr solr 3.9K Oct 13 2017 README.txt
drwxr-xr-x 2 solr solr 4.0K Mar 24 11:39 contexts
drwxr-xr-x 2 solr solr 4.0K Mar 24 11:39 etc
drwxr-xr-x 3 solr solr 4.0K Mar 24 11:39 lib
drwxr-xr-x 2 solr solr 4.0K Mar 24 11:39 modules
drwxr-xr-x 2 solr solr 4.0K Mar 24 11:39 resources
drwxr-xr-x 3 solr solr 4.0K Mar 24 11:39 scripts
drwxr-xr-x 2 solr solr 4.0K Mar 24 11:39 solr
drwxr-xr-x 3 solr solr 4.0K Mar 24 11:39 solr-webapp
-rw-r--r-- 1 solr solr 142.4K May 31 2017 start.jar
Hi @nigelgbanks,
I do get the same digest as you:
"islandora/solr@sha256:8cd1564698f54acc6ee76feadc3e2d22d9b9a6f851609f30d3f0bf531f906c0f" - yours
"islandora/solr@sha256:8cd1564698f54acc6ee76feadc3e2d22d9b9a6f851609f30d3f0bf531f906c0f" - mine
For some reason (I may be blind) I can't see the logs directory in the output you pasted above, either in the alpha-14 or alpha-15 directory listings.
I have tried deleting the image and starting from scratch with:
docker image rm islandora/solr:1.0.0-alpha-15
docker pull islandora/solr:1.0.0-alpha-15
docker run --rm -it islandora/solr:1.0.0-alpha-15
But, I get the same error about the logs file not being writeable.
I'll see if there is another way I can purge things to start from scratch, there must be something I'm doing that has upset the apple cart.
Thanks for your help!
For some reason (I may be blind) I can't see the logs directory in the output you pasted above, either in the alpha-14 or alpha-15 directory listings.
You're not blind, that was my point: it doesn't exist in the container. It's not until the solr program starts that the folder is created.
Hmm, at this point the only thing I can think of is some extra security layer like AppArmor or SELinux.
Or perhaps a very old kernel? Though the syscalls used by things like mkdir should be fine across all the versions that docker supports. Docker isn't virtualized: it runs on the host Linux and uses the host's kernel to do things like open files and send packets on the network. As such there can be discrepancies between libc versions if, for example, the container is using musl libc (which ours does) that targets a different version of the Linux kernel than the host system. Typically this is fine, as most syscalls are binary backwards compatible. I really doubt it is this, but I've seen it on custom Linux builds before, where things like cd didn't work inside the container due to differences in the ABI between the libc in the container and the kernel.
Can you do a docker info and put the results on the ticket?
# In a Virtual machine with SELinux enabled note the security options
docker info
# Yields
Client:
Context: default
Debug Mode: false
Plugins:
compose: Docker Compose (Docker Inc., v2.6.0)
Server:
Containers: 16
Running: 16
Paused: 0
Stopped: 0
Images: 16
Server Version: 20.10.12
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
Logging Driver: journald
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: /usr/libexec/docker/docker-init
containerd version:
runc version: 9ac869a-dirty
init version:
Security Options:
seccomp
Profile: default
selinux
cgroupns
Kernel Version: 5.16.16-200.fc35.x86_64
Operating System: Fedora CoreOS 35.20220327.3.0
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.741GiB
Name: localhost.localdomain
ID: WBLM:4MOO:7E66:BIC2:UJOQ:SOPN:U3JK:SGHQ:DBLW:5CHV:SKK4:OBPK
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: true
# Native on a NixOS Linux Server
Client:
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc., 0.0.0+unknown)
compose: Docker Compose (Docker Inc., v2.0.1)
Server:
Containers: 4
Running: 1
Paused: 0
Stopped: 3
Images: 147
Server Version: 20.10.17
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
Logging Driver: journald
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: docker-init
containerd version: v1.6.4
runc version:
init version:
Security Options:
seccomp
Profile: default
cgroupns
Kernel Version: 5.15.47
Operating System: NixOS 22.05 (Quokka)
OSType: linux
Architecture: x86_64
CPUs: 32
Total Memory: 62.73GiB
Name: shadow
ID: XPBY:OGLQ:FV6H:WZI2:ERYQ:OLVG:A3G2:3XXV:5PAH:DUNV:IAIB:5SOW
Docker Root Dir: /nix/persist/var/lib/docker
Debug Mode: false
Username: islandoracommunity
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: true
Hi @nigelgbanks,
Thanks for the informative reply - I'm learning things :)
Below is my docker info output. It looks like I have an older kernel version, and under the security section it has apparmor.
Keen to know what you can glean from this:
Client:
Debug Mode: false
Plugins:
buildx: Build with BuildKit (Docker Inc., v0.6.2)
Server:
Containers: 40
Running: 17
Paused: 0
Stopped: 23
Images: 356
Server Version: 19.03.2
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 894b81a4b802e4eb2a91d1ce216b8817763c29fb
runc version: 425e105d5a03fabd737a126ad93d62a9eeede87f
init version: fec3683
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 5.13.0-51-generic
Operating System: Ubuntu 20.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 31GiB
Name: antbrown.catalyst.net.nz
ID: BJQI:OTK5:L5BK:JT3A:72QN:WBMA:O5DD:ZOU5:YYSG:XSFT:Z7SA:FOHV
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Nothing there stands out as problematic to me. Might it be the AppArmor profile?
Does the following work for you?
docker run --rm -it --security-opt apparmor=unconfined islandora/solr:1.0.0-alpha-15
No dice :(
docker run --rm -it --security-opt apparmor=unconfined islandora/solr:1.0.0-alpha-15
[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 00-container-environment-00-init.sh: executing...
[cont-init.d] 00-container-environment-00-init.sh: exited 0.
[cont-init.d] 00-container-environment-02-database-defaults.sh: executing...
[cont-init.d] 00-container-environment-02-database-defaults.sh: exited 0.
[cont-init.d] 01-confd-render-templates.sh: executing...
[cont-init.d] 01-confd-render-templates.sh: exited 0.
[cont-init.d] 02-solr-setup.sh: executing...
[cont-init.d] 02-solr-setup.sh: exited 0.
[cont-init.d] done.
[services.d] starting services
[services.d] done.
ERROR: Logs directory /opt/solr/server/logs is not writable. Exiting
[services.d] service solr finish: executing...
[services.d] service solr exiting with exit code: 1
[cont-finish.d] executing container finish scripts...
[cont-finish.d] done.
[s6-finish] waiting for services.
[s6-finish] sending all processes the TERM signal.
[s6-finish] sending all processes the KILL signal and exiting.
Super weird...
Can you try this?
# Create the container, but launch into the "ash" shell rather than starting the solr service.
docker run --rm -ti --entrypoint ash islandora/solr:1.0.0-alpha-15
# Check the permissions and files in the solr install directory.
ls -lah /opt/solr/server/
total 188K
drwxr-xr-x 10 solr solr 4.0K Mar 24 11:39 .
drwxr-xr-x 6 solr solr 4.0K Mar 24 11:40 ..
-rw-r--r-- 1 solr solr 3.9K Oct 13 2017 README.txt
drwxr-xr-x 2 solr solr 4.0K Mar 24 11:39 contexts
drwxr-xr-x 2 solr solr 4.0K Mar 24 11:39 etc
drwxr-xr-x 3 solr solr 4.0K Mar 24 11:39 lib
drwxr-xr-x 2 solr solr 4.0K Mar 24 11:39 modules
drwxr-xr-x 2 solr solr 4.0K Mar 24 11:39 resources
drwxr-xr-x 3 solr solr 4.0K Mar 24 11:39 scripts
drwxr-xr-x 2 solr solr 4.0K Mar 24 11:39 solr
drwxr-xr-x 3 solr solr 4.0K Mar 24 11:39 solr-webapp
-rw-r--r-- 1 solr solr 142.4K May 31 2017 start.jar
# Start solr services
/init
[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 00-container-environment-00-init.sh: executing...
[cont-init.d] 00-container-environment-00-init.sh: exited 0.
[cont-init.d] 00-container-environment-02-database-defaults.sh: executing...
[cont-init.d] 00-container-environment-02-database-defaults.sh: exited 0.
[cont-init.d] 01-confd-render-templates.sh: executing...
[cont-init.d] 01-confd-render-templates.sh: exited 0.
[cont-init.d] 02-solr-setup.sh: executing...
[cont-init.d] 02-solr-setup.sh: exited 0.
# ...
# After it fails and exits do another directory listing.
ls -lah /opt/solr/server/
total 204K
drwxr-xr-x 1 solr solr 4.0K Jun 29 09:55 .
drwxr-xr-x 1 solr solr 4.0K Mar 24 11:40 ..
-rw-r--r-- 1 solr solr 3.9K Oct 13 2017 README.txt
drwxr-xr-x 2 solr solr 4.0K Mar 24 11:39 contexts
drwxr-xr-x 2 solr solr 4.0K Mar 24 11:39 etc
drwxr-xr-x 3 solr solr 4.0K Mar 24 11:39 lib
drwxr-xr-x 3 solr solr 4.0K Jun 29 09:55 logs
drwxr-xr-x 2 solr solr 4.0K Mar 24 11:39 modules
drwxr-xr-x 2 solr solr 4.0K Mar 24 11:39 resources
drwxr-xr-x 3 solr solr 4.0K Mar 24 11:39 scripts
drwxr-xr-x 1 solr solr 4.0K Mar 24 11:39 solr
drwxr-xr-x 3 solr solr 4.0K Mar 24 11:39 solr-webapp
-rw-r--r-- 1 solr solr 142.4K May 31 2017 start.jar
Oh and also try this:
docker run --privileged --rm -ti islandora/solr:1.0.0-alpha-15
If neither of those garners any insight we'll have to resort to debugging.
mkdir /tmp/strace
docker run --privileged --rm -ti -v /tmp/strace:/tmp/strace --entrypoint bash islandora/solr:1.0.0-alpha-15 -c "apk add strace; strace -o /tmp/strace/logs -t -ff /init"
tar -czvf /tmp/strace.tgz /tmp/strace
Then upload the tgz file to the GitHub issue.
@nigelgbanks good thinking, I tried checking the permissions before and after startup and they remained unchanged. However, when I run the command using the --privileged flag like you suggested I get a working Solr instance:
docker run --privileged --rm -ti islandora/solr:1.0.0-alpha-15
[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 00-container-environment-00-init.sh: executing...
[cont-init.d] 00-container-environment-00-init.sh: exited 0.
[cont-init.d] 00-container-environment-02-database-defaults.sh: executing...
[cont-init.d] 00-container-environment-02-database-defaults.sh: exited 0.
[cont-init.d] 01-confd-render-templates.sh: executing...
[cont-init.d] 01-confd-render-templates.sh: exited 0.
[cont-init.d] 02-solr-setup.sh: executing...
[cont-init.d] 02-solr-setup.sh: exited 0.
[cont-init.d] done.
[services.d] starting services
[services.d] done.
2022-06-29 10:21:34.314 INFO (main) [ ] o.e.j.s.Server jetty-9.3.20.v20170531
2022-06-29 10:21:34.794 INFO (main) [ ] o.a.s.s.SolrDispatchFilter ___ _ Welcome to Apache Solr™ version 7.1.0
2022-06-29 10:21:34.795 INFO (main) [ ] o.a.s.s.SolrDispatchFilter / __| ___| |_ _ Starting in standalone mode on port 8983
2022-06-29 10:21:34.795 INFO (main) [ ] o.a.s.s.SolrDispatchFilter \__ \/ _ \ | '_| Install dir: /opt/solr, Default config dir: /opt/solr/server/solr/configsets/_default/conf
2022-06-29 10:21:34.825 INFO (main) [ ] o.a.s.s.SolrDispatchFilter |___/\___/_|_| Start time: 2022-06-29T10:21:34.802Z
2022-06-29 10:21:34.843 INFO (main) [ ] o.a.s.c.SolrResourceLoader Using system property solr.solr.home: /opt/solr/server/solr
2022-06-29 10:21:34.852 INFO (main) [ ] o.a.s.c.SolrXmlConfig Loading container configuration from /opt/solr/server/solr/solr.xml
2022-06-29 10:21:35.381 INFO (main) [ ] o.a.s.c.CorePropertiesLocator Found 0 core definitions underneath /opt/solr/server/solr
2022-06-29 10:21:35.447 INFO (main) [ ] o.e.j.s.Server Started @1635ms
Will I need to find a way to run in privileged mode by default, so I can make use of the Makefile commands? I guess it has something to do with my user not having the right permissions to create folders?? Mmm, that doesn't make sense. I'm still confused as to what the issue is, but you've certainly found a solution so thank you!
Oops, I missed your last message about debug info, that looks like fun and I haven’t done it before so I’ll give it a go in the morning. I have to go to bed now unfortunately. Thanks for your help.
I wouldn't recommend running in --privileged mode. It's not very safe: it can allow an attacker to easily escape the container and get access to the host system. At least now we have a clue.
We might be able to find out more if we log all the system calls using strace; it may show why mkdir failed.
mkdir /tmp/strace
docker run --cap-add=SYS_PTRACE --rm -ti -v /tmp/strace:/tmp/strace --entrypoint bash islandora/solr:1.0.0-alpha-15 -c "apk add strace; strace -o /tmp/strace/logs -t -ff /init"
tar -czvf /tmp/strace.tgz /tmp/strace
Then upload the tgz file to the GitHub issue.
Oh, when you gather the debug info be sure to use the updated command I pasted with --cap-add=SYS_PTRACE instead of --privileged, as we want to test it failing.
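Once the trace has been gathered, one rough way to skim it is to look only at the failed calls that mention the logs directory. This is a sketch, not something from the isle-buildkit repo: failed_calls_for_path is a hypothetical helper name, and it relies only on strace's usual convention of ending a failed call with "= -1" followed by the errno name.

```shell
#!/bin/sh
# Hypothetical helper: list failed syscalls that mention a path in strace output.
# $1 = directory holding the per-process strace files (e.g. /tmp/strace)
# $2 = path to look for (e.g. /opt/solr/server/logs)
failed_calls_for_path() {
  # First keep lines mentioning the path (fixed string, all files in $1),
  # then keep only the calls that returned -1.
  grep -rF -- "$2" "$1" | grep -- '= -1 '
}

# Example:
#   failed_calls_for_path /tmp/strace /opt/solr/server/logs
```

Each matching line is prefixed with the file it came from, which with -ff corresponds to one traced process.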
Hmm, I think the default seccomp profile for your install might be a bit messed up.
You could do a full re-install of docker and that may sort the issue.
Otherwise you could download the default profile and get dockerd to use it explicitly.
This is the default profile that docker recommends.
You could then add the following setting to /etc/docker/daemon.json
{
  "seccomp-profile": "/LOCATION/OF/DOWNLOADED/default.json"
}
Restart docker.
sudo systemctl restart docker
At which point
docker run --rm -ti islandora/solr:1.0.0-alpha-15
should work.
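A stray comma or quote in daemon.json is easy to miss, and as far as I know dockerd will not start if it cannot parse the file, so it may be worth checking the file before restarting docker. A minimal sketch, assuming either jq or python3 is installed; is_valid_json is a hypothetical helper name, not part of docker.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given file parses as JSON.
is_valid_json() {
  if command -v jq >/dev/null 2>&1; then
    # "jq empty" parses the input and prints nothing.
    jq empty "$1" >/dev/null 2>&1
  else
    # Fallback (assumption: python3 is available when jq is not).
    python3 -m json.tool "$1" >/dev/null 2>&1
  fi
}

# Example:
#   is_valid_json /etc/docker/daemon.json && sudo systemctl restart docker
```

This only checks syntax; it does not check that the key names are ones dockerd understands.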
Good morning @nigelgbanks,
I had a read of the seccomp page you linked to and downloaded the default profile and followed instructions to make docker pick it up, but got this error when starting the solr container:
docker: Error response from daemon: OCI runtime create failed:
container_linux.go:345: starting container process caused "error adding seccomp filter rule
for syscall clone3: permission denied": unknown.
I didn't quite understand it but after reading a few github issues where people were having similar issues it appears runc and/or containerd.io versions were to blame, so I started from scratch and re-installed docker, and now solr alpha-15 boots up fine. The installation instructions were different to the last time I installed docker, which makes me wonder how often I should revisit it instead of relying on my normal apt-get upgrade.
I did also run the strace and looked through the output to find all references to the logs directory, it was interesting to learn about the nuts and bolts of what the init script is doing, but I will need to do a bit more learning before I understand all that is going on.
It's been quite the journey and I'd like to thank you for all the help you've provided in getting me to a place where I can run isle-dc alpha-15 and not have to be stuck on an older version, many thanks! :)
I am running on ubuntu 20.04 and during the process of updating my isle-dc environment, I hit this same issue (ERROR: Logs directory /opt/solr/server/logs is not writable. Exiting)
I upgraded docker from 19.03.13 to 20.10.18. Dependencies caused containerd to be upgraded from 1.3.7 to 1.6.8.
I'm confirming that the upgrade resolved this issue.
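For anyone wanting to check their own install against the versions mentioned in this thread, here is a small sketch of a dotted-version comparison. version_ge is a hypothetical helper, and the 20.10.0 threshold in the usage comment is an assumption based only on the versions reported working and failing above, not a documented minimum.

```shell
#!/bin/sh
# Hypothetical helper: succeed if dotted version $1 >= $2.
# Only handles purely numeric fields such as 20.10.18.
version_ge() {
  a=$1
  b=$2
  while [ -n "$a" ] || [ -n "$b" ]; do
    # Take the leading field of each version; missing fields count as 0.
    fa=${a%%.*}
    fb=${b%%.*}
    [ -n "$fa" ] || fa=0
    [ -n "$fb" ] || fb=0
    if [ "$fa" -gt "$fb" ]; then return 0; fi
    if [ "$fa" -lt "$fb" ]; then return 1; fi
    # Drop the field just compared and continue with the rest.
    case $a in *.*) a=${a#*.} ;; *) a= ;; esac
    case $b in *.*) b=${b#*.} ;; *) b= ;; esac
  done
  return 0
}

# Example (needs a running docker daemon):
#   v=$(docker version --format '{{.Server.Version}}')
#   version_ge "$v" 20.10.0 || echo "docker $v is older than the versions reported working here"
```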
Hello,
I used to be able to run make demo and everything worked out of the box a few months ago, but recently this no longer seems to be the case. I get the following error during start up:
And in the docker-compose logs the solr container continually starts/restarts with the error:
ERROR: Logs directory /opt/solr/server/logs is not writable. Exiting
I can verify that directory doesn't exist in the container by running:
And I can add a volume and mount it to that directory to make the error go away:
However, when I try to modify build/docker-compose/docker-compose.demo.yml and do a make docker-compose, it doesn't seem to fix anything.
Any help is much appreciated.