Open jkakavas opened 2 years ago
Pinging @elastic/es-security (Team:Security)
Clarification: From the stack trace, AutoConfigureNode CLI is experiencing the error, not Elasticsearch.
Startup: Container => /usr/local/bin/docker-entrypoint.sh => /usr/share/elasticsearch/bin/elasticsearch
Looking at /usr/share/elasticsearch/bin/elasticsearch, it seems like the variable ATTEMPT_SECURITY_AUTO_CONFIG=true triggers a call to the AutoConfigureNode CLI before Elasticsearch starts. The stack trace is for the AutoConfigureNode CLI, not Elasticsearch.
Excerpt of the AutoConfigure CLI command:
ES_MAIN_CLASS=org.elasticsearch.xpack.security.cli.AutoConfigureNode \
ES_ADDITIONAL_SOURCES="x-pack-env;x-pack-security-env" \
ES_ADDITIONAL_CLASSPATH_DIRECTORIES=lib/tools/security-cli \
bin/elasticsearch-cli "${ARG_LIST[@]}" <<<"$KEYSTORE_PASSWORD"
Excerpt of the Elasticsearch daemon command:
"$JAVA" \
"$XSHARE" \
$ES_JAVA_OPTS \
-Des.path.home="$ES_HOME" \
-Des.path.conf="$ES_PATH_CONF" \
-Des.distribution.flavor="$ES_DISTRIBUTION_FLAVOR" \
-Des.distribution.type="$ES_DISTRIBUTION_TYPE" \
-Des.bundled_jdk="$ES_BUNDLED_JDK" \
-cp "$ES_CLASSPATH" \
org.elasticsearch.bootstrap.Elasticsearch \
"${ARG_LIST[@]}" \
<<<"$KEYSTORE_PASSWORD" &
Reproduce original issue by executing
> docker run --name elastic1 -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -v C:\Docker\elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml --rm -it docker.elastic.co/elasticsearch/elasticsearch:8.0.0
Exception in thread "main" java.nio.file.FileSystemException: /usr/share/elasticsearch/config/elasticsearch.yml.Occjcc_mS06vpoRLwlpUwA.tmp -> /usr/share/elasticsearch/config/elasticsearch.yml: Device or resource busy
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
at java.base/sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:416)
at java.base/sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:267)
at java.base/java.nio.file.Files.move(Files.java:1432)
at org.elasticsearch.xpack.security.cli.AutoConfigureNode.fullyWriteFile(AutoConfigureNode.java:1136)
at org.elasticsearch.xpack.security.cli.AutoConfigureNode.fullyWriteFile(AutoConfigureNode.java:1148)
at org.elasticsearch.xpack.security.cli.AutoConfigureNode.execute(AutoConfigureNode.java:687)
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:77)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112)
at org.elasticsearch.cli.Command.main(Command.java:77)
at org.elasticsearch.xpack.security.cli.AutoConfigureNode.main(AutoConfigureNode.java:157)
Extract interesting files from the container (prerequisite: add C:\Docker to the file sharing accept list)
> docker run --name elastic1 -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -v "C:\Docker":/mnt/local --rm -it docker.elastic.co/elasticsearch/elasticsearch:8.0.0 bash
elasticsearch@9d37e1eb7777:~$ cp /usr/share/elasticsearch/config/elasticsearch.yml /mnt/local/elasticsearch.yml
elasticsearch@9d37e1eb7777:~$ cp /usr/share/elasticsearch/config/elasticsearch.yml /mnt/local/elasticsearch2.yml
elasticsearch@9d37e1eb7777:~$ cp /usr/local/bin/docker-entrypoint.sh /mnt/local/docker-entrypoint.sh
elasticsearch@9d37e1eb7777:~$ cp /usr/share/elasticsearch/bin/elasticsearch /mnt/local/elasticsearch
Start bash as the root user, switch to elasticsearch, and manually run docker-entrypoint.sh to reproduce the original error
> docker run -u root --name elastic1 -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -v C:\Docker\elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml -v C:\Docker\elasticsearch2.yml:/usr/share/elasticsearch/config/elasticsearch2.yml --rm -it docker.elastic.co/elasticsearch/elasticsearch:8.0.0 bash
root@62b736fca663:/usr/share/elasticsearch# ls -l /usr/share/elasticsearch/config/elasticsearch*.yml
-rw-rw-r-- 1 root root 1042 Feb 3 16:47 /usr/share/elasticsearch/config/elasticsearch-plugins.example.yml
-rwxr-xr-x 1 root root 53 Mar 29 19:01 /usr/share/elasticsearch/config/elasticsearch.yml
-rwxr-xr-x 1 root root 53 Mar 29 19:01 /usr/share/elasticsearch/config/elasticsearch2.yml
root@62b736fca663:/usr/share/elasticsearch# df -a | grep elasticsearch
grpcfuse 998896636 190624520 808272116 20% /usr/share/elasticsearch/config/elasticsearch.yml
grpcfuse 998896636 190624520 808272116 20% /usr/share/elasticsearch/config/elasticsearch2.yml
root@62b736fca663:/usr/share/elasticsearch# su - elasticsearch
elasticsearch@62b736fca663:~$ /usr/local/bin/docker-entrypoint.sh
Exception in thread "main" java.nio.file.FileSystemException: /usr/share/elasticsearch/config/elasticsearch.yml.JrtBhUSPQ4eNKgiJ3atKQQ.tmp -> /usr/share/elasticsearch/config/elasticsearch.yml: Device or resource busy
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:100)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
at java.base/sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:416)
at java.base/sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:267)
at java.base/java.nio.file.Files.move(Files.java:1432)
at org.elasticsearch.xpack.security.cli.AutoConfigureNode.fullyWriteFile(AutoConfigureNode.java:1136)
at org.elasticsearch.xpack.security.cli.AutoConfigureNode.fullyWriteFile(AutoConfigureNode.java:1148)
at org.elasticsearch.xpack.security.cli.AutoConfigureNode.execute(AutoConfigureNode.java:687)
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:77)
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:112)
at org.elasticsearch.cli.Command.main(Command.java:77)
at org.elasticsearch.xpack.security.cli.AutoConfigureNode.main(AutoConfigureNode.java:157)
elasticsearch@62b736fca663:~$ ls -l /usr/share/elasticsearch/config/elasticsearch*.yml
-rw-rw-r-- 1 root root 1042 Feb 3 16:47 /usr/share/elasticsearch/config/elasticsearch-plugins.example.yml
-rwxr-xr-x 1 elasticsearch elasticsearch 53 Mar 29 19:01 /usr/share/elasticsearch/config/elasticsearch.yml
-rwxr-xr-x 1 root root 53 Mar 29 19:01 /usr/share/elasticsearch/config/elasticsearch2.yml
Check elasticsearch.yml ownership and permissions before and after manually running docker-entrypoint.sh.
>docker run -u root --name elastic1 -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" --rm -it docker.elastic.co/elasticsearch/elasticsearch:8.0.0 bash
root@40b71bc4c3ae:/usr/share/elasticsearch# ls -l /usr/share/elasticsearch/config/elasticsearch.yml
-rw-rw-r-- 1 root root 53 Feb 3 22:53 /usr/share/elasticsearch/config/elasticsearch.yml
root@40b71bc4c3ae:/usr/share/elasticsearch# su - elasticsearch
elasticsearch@40b71bc4c3ae:~$ /usr/local/bin/docker-entrypoint.sh > /dev/null 2> /dev/null &
[1] 18
elasticsearch@40b71bc4c3ae:~$ ls -l /usr/share/elasticsearch/config/elasticsearch.yml
-rw-rw-r-- 1 elasticsearch elasticsearch 1106 Mar 29 20:47 /usr/share/elasticsearch/config/elasticsearch.yml
If the operator does not mount elasticsearch.yml, I assume they want elasticsearch.yml autoconfiguration. If the operator mounts elasticsearch.yml, I assume they don't want elasticsearch.yml autoconfiguration.
From looking at the startup scripts, I don't see an option to skip autoconfiguration. The only way seems to be if ENROLLMENT_TOKEN is set.
/usr/local/bin/docker-entrypoint.sh looks for it and calls /usr/share/elasticsearch/bin/elasticsearch --enrollment-token $ENROLLMENT_TOKEN.
/usr/share/elasticsearch/bin/elasticsearch only skips autoconfiguration (i.e. ATTEMPT_SECURITY_AUTO_CONFIG=false) if one of these parameters is present: --enrollment-token, --help, -h, --version, or -v.
Note that in addition to Elasticsearch, Kibana also overwrites its configuration file to write content. So should the initialization file be separated from the actual configuration file, like a .conf.d file, for example by adding an elasticsearch-d.yml that is responsible for initialization?
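The argument scan described in the startup-script comment above can be sketched roughly as follows. This is only an illustration: the function name `should_attempt_auto_config` is hypothetical, and the option list is copied from that comment rather than from the real bin/elasticsearch script.

```shell
#!/usr/bin/env bash
# Sketch only: decide whether security auto-configuration would be attempted,
# based on the command-line options listed in the comment above.
# The function name is hypothetical; it is not part of the real script.
should_attempt_auto_config() {
  local arg
  for arg in "$@"; do
    case "$arg" in
      --enrollment-token|--help|-h|--version|-v)
        return 1 ;;   # one of the "skip" options is present
    esac
  done
  return 0            # nothing matched, so auto-configuration would be attempted
}
```

Under this sketch, passing only settings such as `-E discovery.type=single-node` still leads to an auto-configuration attempt, which matches the observation that mounting elasticsearch.yml alone does not switch it off.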
If the operator does not mount elasticsearch.yml, I assume they want elasticsearch.yml autoconfiguration. If the operator mounts elasticsearch.yml, I assume they don't want elasticsearch.yml autoconfiguration.
If you're proposing this should be the logic we use in the auto-configuration, I concur.
Should the same logic extend to the config directory?
If the operator mounts elasticsearch.yml, I assume they don't want elasticsearch.yml autoconfiguration.
I'd just like to add that this is not always the case. Whether we should accept that as a limitation and work with it is another topic (which I probably also agree with), but in both cases where this was reported in the forums, the users wanted to set a specific value (e.g. network.host, to affect the SANs of the HTTP certificate) while still taking advantage of the security features.
We briefly discussed this today in our weekly sync.
There was consensus that mounting only the elasticsearch.yml file, but leaving the rest of the config directory on the docker container, is not a configuration that works well with Security auto-configuration (primarily because persisting only the generated yml file, without the associated keystore and certs, is not useful for subsequent container runs).
I have taken an action item to investigate what is the consistent way to react to such a configuration, from starting without security auto-conf, or not starting at all. I'll assign this to me.
Got this issue with version 8.1.2.
Having a dedicated elasticsearch.yml configuration file is simpler to handle than defining all the env variables in docker-compose.yaml, which can get very verbose when using multiple docker services.
This is due to the container using elasticsearch:elasticsearch as the user. Docker containers are intended to run everything via root:root.
All you need to do is set the ownerID and groupID of the directories being mounted to 1000:1000
ex:
- name: Create elk directory if it does not exist
  ansible.builtin.file:
    path: /opt/elk/{{ item.name }}
    state: directory
    mode: '0755'
    owner: "{{ item.oid }}"
    group: "{{ item.gid }}"
  with_items:
    - { name: "elasticsearch/config", oid: 1000, gid: 1000}
    - { name: "elasticsearch/data", oid: 1000, gid: 1000}
    - { name: "kibana/config", oid: 1000, gid: 1000}
    - { name: "kibana/data", oid: 1000, gid: 1000}
  become: yes
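For anyone not using Ansible, a small shell check along the same lines might look like the sketch below. The helper name `owned_by` is hypothetical, and it assumes GNU coreutils `stat` (the `-c` option); BSD and macOS `stat` use different flags.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a directory is owned by the expected uid:gid.
# Assumes GNU coreutils stat, where -c '%u:%g' prints numeric owner and group.
owned_by() {
  local dir=$1 expected=$2
  [ "$(stat -c '%u:%g' "$dir")" = "$expected" ]
}
```

A possible use, with the paths from the task above, is `owned_by /opt/elk/elasticsearch/data 1000:1000 || sudo chown -R 1000:1000 /opt/elk/elasticsearch/data`.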
Hi all, is there any progress on this bug? I got this issue with version 8.3.2. Setting the owner ID and group ID of the mounted directories to 1000:1000 does not resolve the issue.
I am using env vars instead of mounting elasticsearch.yml. For example, I add ELASTICSEARCH_FS_SNAPSHOT_REPO_PATH=/mnt/backup in order to set up a snapshot repo.
I have taken an action item to investigate what is the consistent way to react to such a configuration, from starting without security auto-conf, or not starting at all.
@albertzaharovits did you get anywhere with this?
My feeling is that we should do something like this (if we determine auto-configuration is needed): check that the temp file is on the same mount point as elasticsearch.yml and elasticsearch.keystore. If not, then we can assume that auto-configuration will do the wrong thing (that is, it would write files to 2 or more different mount points, leading to one or both being orphaned). In that case we should clean up the temp file and skip the rest of auto-configuration. We can probably just check that the output from findmnt --noheadings --output TARGET --target ${file} is the same for all 3 files (temp, yml, keystore).
We should talk about whether to do that for all packaging types, or just for Docker.
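A minimal sketch of that mount-point comparison, assuming `findmnt` from util-linux is available in the image, could look like this. The function names `mount_target_of` and `all_on_same_mount` are hypothetical; only the `findmnt` flags come from the comment above.

```shell
#!/usr/bin/env bash
# Print the mount point that contains the given path.
# The findmnt flags are the ones quoted in the comment above.
mount_target_of() {
  findmnt --noheadings --output TARGET --target "$1"
}

# Succeed only if every given path resolves to the same mount target.
# Returns 1 on a mismatch and 2 if a lookup fails.
all_on_same_mount() {
  local first target path
  first=$(mount_target_of "$1") || return 2
  shift
  for path in "$@"; do
    target=$(mount_target_of "$path") || return 2
    [ "$target" = "$first" ] || return 1
  done
  return 0
}
```

Called with the temp file, elasticsearch.yml and elasticsearch.keystore, a non-zero result would be the signal to clean up the temp file and skip the rest of auto-configuration.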
Bumping this as this causes issues when trying to run the elasticsearch container as a rootless container using systemd.
I have tried to copy some files (the certs, the .yml and the .keystore) and bind mount them, and then add -e ATTEMPT_SECURITY_AUTO_CONFIG=false to podman run, but I could not get the correct enrollment token.
I would very much like to have all the security bells and whistles autoconfigured for me + persistent storage :)
I think I got it working: rootless containers running at boot without the user having to log in. Here is a little write-up. Hopefully you guys can make this a bit easier!
cat /etc/*-release
Rocky Linux release 8.6 (Green Obsidian)
NAME="Rocky Linux"
VERSION="8.6 (Green Obsidian)"
Initial start to generate some files:
podman run --name es01 --net elastic -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -v /podman/elasticsearch/data:/usr/share/elasticsearch/data:Z -it docker.elastic.co/elasticsearch/elasticsearch:8.3.3
Ctrl + C to quit
cd ~/podman/elasticsearch/config
podman cp es01:/usr/share/elasticsearch/config/elasticsearch.yml .
podman cp es01:/usr/share/elasticsearch/config/elasticsearch.keystore .
mkdir ~/podman/elasticsearch/config/certs
cd certs
podman cp es01:/usr/share/elasticsearch/config/certs/http.p12 .
podman cp es01:/usr/share/elasticsearch/config/certs/transport.p12 .
podman cp es01:/usr/share/elasticsearch/config/certs/http_ca.crt .
podman stop es01
podman rm es01
rm -rf /podman/elasticsearch/data/*
Let us bind mount:
podman run --name es01 --net elastic -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -e ATTEMPT_SECURITY_AUTO_CONFIG=false -v ~/podman/elasticsearch/config/certs/http.p12:/usr/share/elasticsearch/config/certs/http.p12:Z -v ~/podman/elasticsearch/config/certs/transport.p12:/usr/share/elasticsearch/config/certs/transport.p12:Z -v ~/podman/elasticsearch/config/certs/http_ca.crt:/usr/share/elasticsearch/config/certs/http_ca.crt:Z -v ~/podman/elasticsearch/config/elasticsearch.keystore:/usr/share/elasticsearch/config/elasticsearch.keystore:Z -v ~/podman/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml:Z -v /podman/elasticsearch/data:/usr/share/elasticsearch/data:Z -dt docker.elastic.co/elasticsearch/elasticsearch:8.3.3
Let us get the enrollment token (a lot of errors here, but it spits out the code in the end):
podman exec -it es01 /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s kibana
Let us start Kibana and go through the enrollment procedure by visiting the website:5601 and grabbing the code from the terminal:
podman run --name kib-01 --net elastic -p 5601:5601 -v ~/podman/kibana/data/:/usr/share/kibana/data/:Z docker.elastic.co/kibana/kibana:8.3.3
Ctrl + C to stop Kibana.
But let us start it again so we can grab the kibana.yml configuration file:
podman start kib-01
mkdir ~/podman/kibana/config
cd ~/podman/kibana/config
podman cp kib-01:/usr/share/kibana/config/kibana.yml .
Stop it using podman stop kib-01.
Let us remove this file:
rm ~/podman/kibana/data/uuid
This will be our final run command for Kibana:
podman run --name kib-01 --net elastic -p 5601:5601 -v ~/podman/kibana/config/kibana.yml:/usr/share/kibana/config/kibana.yml:Z -v ~/podman/kibana/data/:/usr/share/kibana/data:Z -e SERVER_PUBLICBASEURL=http://192.168.10.44 -dt docker.elastic.co/kibana/kibana:8.3.3
Now I have working persistent configuration and I can generate systemd unit files (??). Let us also stop the containers and remove them:
cd ~/.config/systemd/user
podman generate systemd --new --files --name es01
podman generate systemd --new --files --name kib-01
podman stop kib-01
podman rm kib-01
podman stop es01
podman rm es01
Using systemctl for now:
systemctl --user enable --now container-es01.service
systemctl --user enable --now container-kib-01.service
But hey, what about passwords?
This throws a lot of errors, although in the end it works and gives me a valid password for the elastic user:
podman exec -it es01 /usr/share/elasticsearch/bin/elasticsearch-reset-password -u elastic
I can reboot the host computer and everything works without having to log in (loginctl enable-linger).
The transport is now SSL encrypted, I have all the bells and whistles offered from the auto-configuration?
I found that if I explicitly set xpack.security.enabled: true and bind mount a keystore that has a bootstrap.password set, then bind mounting the elasticsearch.yml works fine. I haven't dug into the details of why or if that is correct behavior, but that is what I have observed.
Here is very simple single node cluster with a bind mounted elasticsearch.yml and keystore : https://github.com/jakelandis/es-docker-simple
I found that if I explicitly set xpack.security.enabled: true and bind mount a keystore that has a bootstrap.password set, then bind mounting the elasticsearch.yml works fine. I haven't dug into the details of why or if that is correct behavior, but that is what I have observed.
Here is very simple single node cluster with a bind mounted elasticsearch.yml and keystore : https://github.com/jakelandis/es-docker-simple
Setting xpack.security.enabled: true in the custom elasticsearch.yaml fixed it for me; now it gets mounted.
I found that if I explicitly set xpack.security.enabled: true and bind mount a keystore that has a bootstrap.password set, then bind mounting the elasticsearch.yml works fine. I haven't dug into the details of why or if that is correct behavior, but that is what I have observed.
This is expected because enabling security explicitly makes the startup process skip security auto-configuration. The original error was thrown during security auto-configuration. Since it is skipped, the error no longer happens. But I believe the question for this issue is whether we could either (1) detect the original bind mount situation and automatically skip auto-configuration (IIUC, this is our preference) or (2) have auto-configuration work if the bind mount meets certain requirements.
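To check ahead of time whether a mounted elasticsearch.yml sets security explicitly, a rough pre-flight check might look like the sketch below. The helper name `has_explicit_security_setting` is hypothetical, and the real decision is made by the auto-configuration code itself, not by anything like this grep.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: does the YAML file set xpack.security.enabled
# as a flat, top-level dotted key? A nested layout (xpack: / security: / enabled:)
# is not recognised by this sketch.
has_explicit_security_setting() {
  grep -Eq '^xpack\.security\.enabled[[:space:]]*:' "$1"
}
```

Either value (true or false) counts as explicit here, since the point is only whether the key is present.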
So.. no fix for this yet?
Anyways, try my workaround...
On your docker command line: -v /absolute/path/to/a/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml Make sure the volume file "/absolute/path/to/a/elasticsearch.yml" exists and is writable.
Also, elasticsearch.yml should not be empty.
My example configuration:
cluster.name: "docker-cluster"
network.host: 0.0.0.0
xpack.license.self_generated.type: trial
xpack.security.enabled: true
Since I only use this on localhost, here is my compose file:
version: '3.9'
services:
  elasticsearch:
    container_name: elasticsearch
    image: elasticsearch:8.5.2
    environment:
      - TZ=Etc/GMT-8
      - discovery.type=single-node
      - ES_JAVA_OPTS=-Xmx256M
    deploy:
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 5s
      resources:
        limits:
          cpu: 1
          memory: 2G
    ulimits:
      nofile:
        soft: 65535
        hard: 65535
    sysctls:
      - net.ipv6.conf.all.disable_ipv6=1
      - net.ipv6.conf.default.disable_ipv6=1
      - net.ipv6.conf.lo.disable_ipv6=1
      - net.ipv4.conf.all.rp_filter=0
      - net.ipv4.conf.default.rp_filter=0
      - net.ipv4.conf.default.arp_announce=2
      - net.ipv4.conf.lo.arp_announce=2
      - net.ipv4.conf.all.arp_announce=2
      - net.ipv4.tcp_max_tw_buckets=5000
      - net.ipv4.tcp_syncookies=1
      - net.ipv4.tcp_max_syn_backlog=2048
      - net.core.somaxconn=51200
      - net.ipv4.tcp_synack_retries=2
      - net.ipv4.tcp_fastopen=3
    dns:
      - 223.5.5.5
      - 223.6.6.6
      - 1.1.1.1
      - 1.0.0.1
      - 8.8.8.8
      - 8.8.4.4
    ports:
      - target: 9200
        published: 9200
        protocol: tcp
        mode: host
    volumes:
      - type: bind
        source: /www/server/elasticsearch/config/elasticsearch.yml
        target: /usr/share/elasticsearch/config/elasticsearch.yml
      - type: bind
        source: /www/server/elasticsearch/data
        target: /usr/share/elasticsearch/data
      - type: bind
        source: /www/server/elasticsearch/plugins
        target: /usr/share/elasticsearch/plugins
    healthcheck:
      disable: true
networks:
  default:
    name: podman
    external: true
With that compose file, the config
cluster.name: docker-cluster
network.host: 0.0.0.0
xpack.security.enabled: false
works, but the config
cluster.name: docker-cluster
network.host: 0.0.0.0
produces this error.
So I think https://github.com/elastic/elasticsearch/issues/85463#issuecomment-1229264396 is correct!
god, who broke this? on 8.6.1 we aren't getting these errors.
elasticsearch-1 | Could not rename log file 'logs/gc.log' to 'logs/gc.log.03' (Permission denied).
elasticsearch-1 | {"@timestamp":"2023-06-06T09:41:21.503Z", "log.level":"ERROR", "message":"fatal exception while booting Elasticsearch", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.bootstrap.Elasticsearch","elasticsearch.node.name":"9a8a73e358ed","elasticsearch.cluster.name":"elasticsearch","error.type":"java.lang.IllegalStateException","error.message":"failed to obtain node locks, tried [/usr/share/elasticsearch/data]; maybe these locations are not writable or multiple nodes were started on the same data path?","error.stack_trace":"java.lang.IllegalStateException: failed to obtain node locks, tried [/usr/share/elasticsearch/data]; maybe these locations are not writable or multiple nodes were started on the same data path?\n\tat org.elasticsearch.server@8.8.0/org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:291)\n\tat org.elasticsearch.server@8.8.0/org.elasticsearch.node.Node.<init>(Node.java:483)\n\tat org.elasticsearch.server@8.8.0/org.elasticsearch.node.Node.<init>(Node.java:327)\n\tat org.elasticsearch.server@8.8.0/org.elasticsearch.bootstrap.Elasticsearch$2.<init>(Elasticsearch.java:216)\n\tat org.elasticsearch.server@8.8.0/org.elasticsearch.bootstrap.Elasticsearch.initPhase3(Elasticsearch.java:216)\n\tat org.elasticsearch.server@8.8.0/org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:67)\nCaused by: java.io.IOException: failed to obtain lock on /usr/share/elasticsearch/data\n\tat org.elasticsearch.server@8.8.0/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:236)\n\tat org.elasticsearch.server@8.8.0/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:204)\n\tat org.elasticsearch.server@8.8.0/org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:283)\n\t... 5 more\nCaused by: java.nio.file.NoSuchFileException: /usr/share/elasticsearch/data/node.lock\n\tat java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)\n\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)\n\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)\n\tat java.base/sun.nio.fs.UnixPath.toRealPath(UnixPath.java:833)\n\tat org.apache.lucene.core@9.6.0/org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:94)\n\tat org.apache.lucene.core@9.6.0/org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:43)\n\tat org.apache.lucene.core@9.6.0/org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:44)\n\tat org.elasticsearch.server@8.8.0/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:229)\n\t... 7 more\n\tSuppressed: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/node.lock\n\t\tat java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:90)\n\t\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)\n\t\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)\n\t\tat java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:261)\n\t\tat java.base/java.nio.file.Files.newByteChannel(Files.java:379)\n\t\tat java.base/java.nio.file.Files.createFile(Files.java:657)\n\t\tat org.apache.lucene.core@9.6.0/org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:84)\n\t\t... 10 more\n"}
Running into the same problems. Was working fine the whole time with docker-compose, and suddenly when having killed the container and restarting it, I'm getting these errors:
bm-elasticsearch-poc-elastic-1 | {"@timestamp":"2023-07-26T08:04:29.450Z", "log.level":"ERROR", "message":"fatal exception while booting Elasticsearch", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.bootstrap.Elasticsearch","elasticsearch.node.name":"elastic-0","elasticsearch.cluster.name":"biz","error.type":"java.lang.IllegalStateException","error.message":"failed to obtain node locks, tried [/usr/share/elasticsearch/data]; maybe these locations are not writable or multiple nodes were started on the same data path?","error.stack_trace":"java.lang.IllegalStateException: failed to obtain node locks, tried [/usr/share/elasticsearch/data]; maybe these locations are not writable or multiple nodes were started on the same data path?\n\tat org.elasticsearch.server@8.7.1/org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:291)\n\tat org.elasticsearch.server@8.7.1/org.elasticsearch.node.Node.<init>(Node.java:480)\n\tat org.elasticsearch.server@8.7.1/org.elasticsearch.node.Node.<init>(Node.java:324)\n\tat org.elasticsearch.server@8.7.1/org.elasticsearch.bootstrap.Elasticsearch$2.<init>(Elasticsearch.java:216)\n\tat org.elasticsearch.server@8.7.1/org.elasticsearch.bootstrap.Elasticsearch.initPhase3(Elasticsearch.java:216)\n\tat org.elasticsearch.server@8.7.1/org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:67)\nCaused by: java.io.IOException: failed to obtain lock on /usr/share/elasticsearch/data\n\tat org.elasticsearch.server@8.7.1/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:236)\n\tat org.elasticsearch.server@8.7.1/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:204)\n\tat org.elasticsearch.server@8.7.1/org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:283)\n\t... 5 more\nCaused by: java.nio.file.NoSuchFileException: /usr/share/elasticsearch/data/node.lock\n\tat java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)\n\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)\n\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)\n\tat java.base/sun.nio.fs.UnixPath.toRealPath(UnixPath.java:833)\n\tat org.apache.lucene.core@9.5.0/org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:94)\n\tat org.apache.lucene.core@9.5.0/org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:43)\n\tat org.apache.lucene.core@9.5.0/org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:44)\n\tat org.elasticsearch.server@8.7.1/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:229)\n\t... 7 more\n\tSuppressed: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/node.lock\n\t\tat java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:90)\n\t\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)\n\t\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)\n\t\tat java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:261)\n\t\tat java.base/java.nio.file.Files.newByteChannel(Files.java:379)\n\t\tat java.base/java.nio.file.Files.createFile(Files.java:657)\n\t\tat org.apache.lucene.core@9.5.0/org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:84)\n\t\t... 10 more\n"}
This is the docker-compose:
version: '2.2'
services:
  elastic:
    build:
      context: ./
      dockerfile: docker/elasticsearch/Dockerfile
    privileged: true
    environment:
      - cluster.name=biz
      - node.name=elastic-0
      - xpack.security.enabled=true
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xms2g -Xmx2g"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    mem_limit: 4g
    cap_add:
      - IPC_LOCK
    volumes:
      - ./docker/_data/elasticsearch:/usr/share/elasticsearch/data
    ports:
      - "9200:9200"
    healthcheck:
      test: ["CMD", "curl","-s" ,"-f", "http://localhost:9200/_cat/health"]
      retries: 10
    networks:
      - biz
  kibana:
    image: docker.elastic.co/kibana/kibana:8.7.1
    container_name: kibana
    privileged: true
    ports:
      - "5601:5601"
    healthcheck:
      test: ["CMD", "curl", "-s", "-f", "http://localhost:5601/"]
      retries: 10
    depends_on:
      elastic:
        condition: service_healthy
    environment:
      - "ELASTICSEARCH_HOSTS=http://elastic:9200"
    networks:
      - biz
  app:
    build:
      context: .
      dockerfile: docker/app/Dockerfile
      args:
        - WITH_XDEBUG=true
    environment:
      - DEBUG=true
      - PHP_IDE_CONFIG=serverName=app
      # - XDEBUG_CONFIG=remote_host=172.32.0.1 remote_port=9001
    ports:
      - '80:80'
    volumes:
      - './:/var/www/html'
    networks:
      - biz
networks:
  biz:
    name: biz
    driver: bridge
Nothing has changed whatsoever, and this was working just fine.
Even rebuilding the images doesn't solve the issue. No idea why it can't create the desired lock file, even though I see locally my _data/elasticsearch folder being created from the running container.
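The AccessDeniedException on node.lock in the log above suggests the data directory is not writable by the user the container runs as. One possible cause, which is an assumption here and not confirmed in this thread, is that the bind-mount source directory was created on the host with root ownership. A small probe like the following sketch (the helper name `can_create_files_in` is hypothetical) can confirm whether a given user can create files in a directory:

```shell
#!/usr/bin/env bash
# Hypothetical probe: can the current user create and remove a file in a directory?
# It creates a uniquely named temporary file and deletes it again.
can_create_files_in() {
  local dir=$1 probe
  probe=$(mktemp "$dir/.write-probe.XXXXXX" 2>/dev/null) || return 1
  rm -f "$probe"
}
```

To be meaningful it has to run as the same user as Elasticsearch, for example inside the container against /usr/share/elasticsearch/data.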
Facing the same issue. Has anyone got a solution for this?
Facing the same issue. Has anyone got a solution for this?
The easiest way to work around this issue, originally reported by me at https://discuss.elastic.co/t/300981 , is to follow the instructions in Ioannis Kakavas's reply to the thread.
Two years ago, I didn’t expect this issue to persist for a long time.
Where can I find a docker-compose file and its related config files that actually work? I followed the one in Elasticsearch's official installation guide, but I get logs like:
elasticsearch_container | {"@timestamp":"2024-03-27T13:06:55.893Z", "log.level": "WARN", "data_stream.dataset":"deprecation.elasticsearch","data_stream.namespace":"default","data_stream.type":"logs","elasticsearch.event.category":"settings","event.code":"xpack.monitoring.collection.enabled","message":"[xpack.monitoring.collection.enabled] setting was deprecated in Elasticsearch and will be removed in a future release." , "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"deprecation.elasticsearch","process.thread.name":"main","log.logger":"org.elasticsearch.deprecation.common.settings.Settings","elasticsearch.node.name":"12f9b075322d","elasticsearch.cluster.name":"docker-cluster"}
elasticsearch_container | {"@timestamp":"2024-03-27T13:06:55.910Z", "log.level":"ERROR", "message":"fatal exception while booting Elasticsearch", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"main","log.logger":"org.elasticsearch.bootstrap.Elasticsearch","elasticsearch.node.name":"12f9b075322d","elasticsearch.cluster.name":"docker-cluster","error.type":"java.lang.IllegalStateException","error.message":"failed to obtain node locks, tried [/usr/share/elasticsearch/data]; maybe these locations are not writable or multiple nodes were started on the same data path?","error.stack_trace":"java.lang.IllegalStateException: failed to obtain node locks, tried [/usr/share/elasticsearch/data]; maybe these locations are not writable or multiple nodes were started on the same data path?\n\tat org.elasticsearch.server@8.11.0/org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:294)\n\tat org.elasticsearch.server@8.11.0/org.elasticsearch.node.Node.<init>(Node.java:499)\n\tat org.elasticsearch.server@8.11.0/org.elasticsearch.node.Node.<init>(Node.java:344)\n\tat org.elasticsearch.server@8.11.0/org.elasticsearch.bootstrap.Elasticsearch$2.<init>(Elasticsearch.java:236)\n\tat org.elasticsearch.server@8.11.0/org.elasticsearch.bootstrap.Elasticsearch.initPhase3(Elasticsearch.java:236)\n\tat org.elasticsearch.server@8.11.0/org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:73)\nCaused by: java.io.IOException: failed to obtain lock on /usr/share/elasticsearch/data\n\tat org.elasticsearch.server@8.11.0/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:239)\n\tat org.elasticsearch.server@8.11.0/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:206)\n\tat org.elasticsearch.server@8.11.0/org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:286)\n\t... 5 more\nCaused by: java.nio.file.NoSuchFileException: /usr/share/elasticsearch/data/node.lock\n\tat java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)\n\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)\n\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)\n\tat java.base/sun.nio.fs.UnixPath.toRealPath(UnixPath.java:834)\n\tat org.apache.lucene.core@9.8.0/org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:94)\n\tat org.apache.lucene.core@9.8.0/org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:43)\n\tat org.apache.lucene.core@9.8.0/org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:44)\n\tat org.elasticsearch.server@8.11.0/org.elasticsearch.env.NodeEnvironment$NodeLock.<init>(NodeEnvironment.java:232)\n\t... 7 more\n\tSuppressed: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/node.lock\n\t\tat java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:90)\n\t\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)\n\t\tat java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)\n\t\tat java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:261)\n\t\tat java.base/java.nio.file.Files.newByteChannel(Files.java:379)\n\t\tat java.base/java.nio.file.Files.createFile(Files.java:657)\n\t\tat org.apache.lucene.core@9.8.0/org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:84)\n\t\t... 10 more\n"}
elasticsearch_container | ERROR: Elasticsearch did not exit normally - check the logs at /usr/share/elasticsearch/logs/docker-cluster.log
elasticsearch_container |
elasticsearch_container |
elasticsearch_container | ERROR: Elasticsearch exited unexpectedly, with exit code 1
Don't mount only elasticsearch.yml from the host; mount the whole config directory instead, e.g.:
docker run --name oh-noes-this-fails -p 9200:9200 -v /absolute/path/to/config:/usr/share/elasticsearch/config -it docker.elastic.co/elasticsearch/elasticsearch:8.0.0
You will need to copy all the files under the /usr/share/elasticsearch/config folder to the host and customize them:
$ docker container run -it --rm elasticsearch:8.13.0 bash
elasticsearch@d9d90482601a:~$ cd /usr/share/elasticsearch/config/
elasticsearch@d9d90482601a:~/config$ ls -al
total 68
drwxrwxr-x 1 elasticsearch root 4096 Mar 26 18:49 .
drwxrwxr-x 1 root root 4096 Mar 26 18:49 ..
-rw-rw-r-- 1 root root 1042 Mar 22 03:34 elasticsearch-plugins.example.yml
-rw-rw-r-- 1 root root 53 Mar 26 18:49 elasticsearch.yml
-rw-rw-r-- 1 root root 2727 Mar 22 03:34 jvm.options
drwxrwxr-x 1 elasticsearch root 4096 Mar 22 03:37 jvm.options.d
-rw-rw-r-- 1 root root 17969 Mar 22 03:40 log4j2.file.properties
-rw-rw-r-- 1 root root 12549 Mar 26 18:49 log4j2.properties
-rw-rw-r-- 1 root root 473 Mar 22 03:40 role_mapping.yml
-rw-rw-r-- 1 root root 197 Mar 22 03:40 roles.yml
-rw-rw-r-- 1 root root 0 Mar 22 03:40 users
-rw-rw-r-- 1 root root 0 Mar 22 03:40 users_roles
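One way to get those files onto the host without an interactive shell is `docker create` followed by `docker cp`. The following is a minimal sketch, not an official procedure: `copy_default_config` is a hypothetical helper name, and the image tag and target directory are only examples.

```shell
#!/usr/bin/env bash
# Sketch: seed a host directory with the image's default config files.
# copy_default_config is a hypothetical helper, not part of Elasticsearch or Docker.
copy_default_config() {
  local image="$1" host_dir="$2" cid status
  mkdir -p "$host_dir" || return 1
  # Create (but do not start) a container so its filesystem can be read.
  cid="$(docker create "$image")" || return 1
  # The trailing "/." copies the directory's contents rather than the directory itself.
  docker cp "$cid:/usr/share/elasticsearch/config/." "$host_dir"
  status=$?
  # Remove the throwaway container whether or not the copy succeeded.
  docker rm "$cid" > /dev/null
  return "$status"
}

# Example (requires Docker):
#   copy_default_config elasticsearch:8.13.0 ./config
```

The copied files end up owned by the host user, so the container user may still need read and write access to the directory before it is bind-mounted.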
The hint of the solution came from the error itself:
Exception in thread "main" java.nio.file.FileSystemException: /usr/share/elasticsearch/config/elasticsearch.yml.R0_9BZ4hRx-v8zK3F0U-Bw.tmp -> /usr/share/elasticsearch/config/elasticsearch.yml: Device or resource busy at java.base/java.nio.file.Files.move(Files.java:1432)
The server writes the new configuration to a temporary file and then tries to move it over elasticsearch.yml. Because that path is bind-mounted from the host, it is a mount point inside the container and cannot be replaced by a rename, so the move fails with "Device or resource busy".
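The write-to-a-temporary-file-then-rename pattern visible in the stack trace can be illustrated in shell. This is only a sketch of the general pattern, not the actual AutoConfigureNode code; `replace_file` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Illustration only: write a temporary file next to the target, then rename it
# over the target. replace_file is a hypothetical helper, not Elasticsearch code.
replace_file() {
  local target="$1" content="$2" tmp
  # The temporary file lives in the same directory as the target,
  # similar to "elasticsearch.yml.<random>.tmp" in the error message.
  tmp="$(mktemp "${target}.tmp.XXXXXX")" || return 1
  printf '%s\n' "$content" > "$tmp" || { rm -f "$tmp"; return 1; }
  # This rename is the step expected to fail with "Device or resource busy"
  # when the target itself is a bind-mounted file.
  if ! mv -f "$tmp" "$target"; then
    rm -f "$tmp"
    return 1
  fi
}
```

On an ordinary directory the rename succeeds. When the whole config directory is mounted instead of the single file, the target is a regular entry inside the mount rather than a mount point, which is why that workaround avoids the error.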
Bumping this issue since it seems fairly essential for containerized environments yet 2 years later no realistic solution or workaround has been identified.
@kgrozdanovski what @itbill wrote works. Also, on the first launch it creates the default config files, so you don't need to copy all the files from the container to the host volume.
itbill suggested a workaround which is not a clean solution, nor is it documented anywhere. Furthermore what you are suggesting contradicts his comment since he notes you must copy all config files into the directory you are binding.
TLDR; there is still no real solution.
After reading the docs, here is the fix for the busy error (they talk about the keystore, but it's the same): https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html#_elasticsearch_keystore_device_or_resource_busy
It's not quite the same: ES requires other files in the config directory to run. If I mount an empty config directory for it to generate files in, I get a different error: ERROR: Missing logging config file at /usr/share/elasticsearch/config/log4j2.properties, with exit code 78
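A quick pre-flight check can catch this before the container starts. The sketch below is an assumption-laden helper, not something from the Elasticsearch docs: `missing_config_files` is a hypothetical name, and its default list contains only the two files named in this thread, so it may well be incomplete.

```shell
#!/usr/bin/env bash
# Sketch: report files missing from a config directory before bind-mounting it.
# missing_config_files is a hypothetical helper. The default list names only the
# two files mentioned in this thread; it is an assumption and may be incomplete.
missing_config_files() {
  local dir="$1"; shift
  local required=("$@") missing=0 f
  if [ "${#required[@]}" -eq 0 ]; then
    required=(elasticsearch.yml log4j2.properties)
  fi
  for f in "${required[@]}"; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      missing=1
    fi
  done
  # Returns 0 when every required file exists, 1 otherwise.
  return "$missing"
}

# Example:
#   missing_config_files ./config || echo "config directory is incomplete"
```

Extra file names can be passed after the directory to check a longer list.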
The following works for me.
#!/bin/bash
sudo rm -rf ./elastic/data/
sudo rm -rf ./elastic/logs/
sudo mkdir -p ./elastic/data
sudo mkdir -p ./elastic/logs
sudo chmod -R 777 ./elastic/data/
sudo chmod -R 777 ./elastic/logs/
sudo chown -R 1000:1000 ./elastic/data/
sudo chown -R 1000:1000 ./elastic/logs/
services:
  # https://hub.docker.com/_/elasticsearch
  elastic:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.14.2
    restart: unless-stopped
    ports:
      - 127.0.0.1:9200:9200
    environment:
      network.host: 0.0.0.0
      discovery.type: single-node
      bootstrap.memory_lock: true
      xpack.security.enabled: false
      ingest.geoip.downloader.enabled: false
      logger.org.elasticsearch: ERROR
      logger.com.azure.core: ERROR
      logger.org.apache: ERROR
      ES_JAVA_OPTS: -Xms1g -Xmx1g
      ELASTIC_PASSWORD: ${ELASTIC_PASSWORD:?error}
    volumes:
      - ./elastic/data:/usr/share/elasticsearch/data
      - ./elastic/logs:/usr/share/elasticsearch/logs
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nproc:
        soft: 65536
        hard: 65536
      nofile:
        soft: 65536
        hard: 65536
    cap_add:
      - IPC_LOCK
    healthcheck:
      test:
        [
          "CMD-SHELL",
          "curl --fail --silent http://localhost:9200/_cluster/health",
        ]
      interval: 10s
      timeout: 10s
      retries: 120
    networks:
      - my-network
networks:
  my-network:
    name: my-network
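The "failed to obtain node locks" error earlier in the thread points at a data path that is not writable, and the script above addresses that by giving the bind-mounted directories to 1000:1000. A small check along those lines could look like this; it is a sketch, `owned_by_uid` is a hypothetical helper, and 1000 is taken from the chown in the script rather than verified against the image.

```shell
#!/usr/bin/env bash
# Sketch: check that a host directory is owned by the uid the container is
# assumed to run as. owned_by_uid is a hypothetical helper; the default of 1000
# mirrors the chown in the script above.
owned_by_uid() {
  local dir="$1" want="${2:-1000}" have
  # GNU stat uses -c, BSD/macOS stat uses -f; try both.
  have="$(stat -c '%u' "$dir" 2>/dev/null || stat -f '%u' "$dir" 2>/dev/null)" || return 2
  # Returns 0 on a match, 1 on a mismatch (2 above means stat failed).
  [ "$have" = "$want" ]
}

# Example:
#   owned_by_uid ./elastic/data || echo "data directory is not owned by uid 1000"
```

If the check passes, the recursive chmod 777 in the script may be unnecessary, though that has not been tested here.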
Hey guys, do you think this is going to get fixed anytime soon?
Elasticsearch Version
Installed Plugins
No response
Java Version
bundled
OS Version
N/A
Problem Description
Elasticsearch fails to start with a "Device or resource busy" error when elasticsearch.yml is bind-mounted to a file on the host. This was possibly introduced with the changes for the autoconfiguration of the security features, and it is triggered when we attempt to write the configuration to the elasticsearch.yml file (AutoConfigureNode#fullyWriteFile).
Steps to Reproduce
or
fails with
Logs (if relevant)
No response