stellar / stellar-core

Reference implementation for the peer-to-peer agent that manages the Stellar network.
https://www.stellar.org

Device or resource busy (Version: 19.2.0) #3463

Open piccadil opened 2 years ago

piccadil commented 2 years ago

I have an issue mounting a Docker volume into a stellar-core node. I've written my own Dockerfile on top of the official image from Docker Hub, plus a docker-compose file. After starting, stellar-core connects to the database, but then I get this error:

2022-06-27T10:58:16.017 [default INFO] Config from /etc/stellar/stellar-core.cfg
2022-06-27T10:58:16.017 [default INFO] Generated QUORUM_SET: {
   "t" : 2,
   "v" : [ "sdf_testnet_2", "sdf_testnet_3", "sdf_testnet_1" ]
}

2022-06-27T10:58:16.019 GATVZ [default INFO] Application destructing
2022-06-27T10:58:16.019 GATVZ [default INFO] Application destroyed
2022-06-27T10:58:16.019 GATVZ [default FATAL] Got an exception: filesystem error: in remove_all: Device or resource busy [/stellar/buckets/]
2022-06-27T10:58:16.019 GATVZ [default FATAL] Please report this bug along with this log file if this was not expected

Dockerfile:

ARG STELLAR_CORE_VERSION=19.2.0-966.d18d54aa3.focal
FROM stellar/stellar-core:${STELLAR_CORE_VERSION}
ARG ENV=test
ADD setup /
RUN chmod +x /setup && /setup
COPY stellar-core-test.cfg /stellar/stellar-core.cfg
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/bin/bash","/entrypoint.sh"]

entrypoint.sh:

#!/bin/bash
stellar-core --conf /stellar/stellar-core.cfg new-db
echo "Init finished"
exec "$@"

docker-compose file:

version: '3'
services:
  postgresql-core:
    image: postgres:11.5-alpine
    environment:
      - POSTGRES_LOGGING=true
      - POSTGRES_DB=core
      - POSTGRES_PASSWORD_FILE=/run/secrets/core_postgres_password
      - POSTGRES_USER=stellar
    secrets:
      - core_postgres_password
    ports:
      - 15432:5432
    volumes:
      - /home/user/docker-vols/stellar/postgres:/var/lib/postgresql/data
    restart: on-failure
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U stellar -d core"]
      interval: 10s
      timeout: 5s
      retries: 5
  stellar-core:
    image: stellar-local
    build:
      dockerfile: /home/user/blockchain/blockchain-node-stellar/stellar-core.Dockerfile
      context: ./artifacts/test
    container_name: stellar
    command:
      stellar-core
      --conf /stellar/stellar-core.cfg
      run
    volumes:
      - /home/user/docker-vols/stellar/core/:/stellar/buckets/
    ports:
      - 11626:11626
    depends_on:
      postgresql-core:
        condition: service_healthy
    secrets:
      - core_postgres_password
secrets:
  core_postgres_password:
    file: /home/user/docker-vols/stellar/secrets/core_postgres_password

stellar config:

HTTP_PORT=11626
PUBLIC_HTTP_PORT=true
LOG_FILE_PATH=""
MANUAL_CLOSE=false

NETWORK_PASSPHRASE="Test SDF Network ; September 2015"
KNOWN_CURSORS=["HORIZON"]
DATABASE="postgresql://dbname=core host=postgresql-core user=stellar password=RA5NFxNDrmpuC3t2"
UNSAFE_QUORUM=true
FAILURE_SAFETY=1
CATCHUP_RECENT=100

BUCKET_DIR_PATH="/stellar/buckets/"

[[HOME_DOMAINS]]
HOME_DOMAIN="testnet.stellar.org"
QUALITY="HIGH"

[[VALIDATORS]]
NAME="sdf_testnet_1"
HOME_DOMAIN="testnet.stellar.org"
PUBLIC_KEY="GDKXE2OZMJIPOSLNA6N6F2BVCI3O777I2OOC4BV7VOYUEHYX7RTRYA7Y"
ADDRESS="core-testnet1.stellar.org"
HISTORY="curl -sf http://history.stellar.org/prd/core-testnet/core_testnet_001/{0} -o {1}"

[[VALIDATORS]]
NAME="sdf_testnet_2"
HOME_DOMAIN="testnet.stellar.org"
PUBLIC_KEY="GCUCJTIYXSOXKBSNFGNFWW5MUQ54HKRPGJUTQFJ5RQXZXNOLNXYDHRAP"
ADDRESS="core-testnet2.stellar.org"
HISTORY="curl -sf http://history.stellar.org/prd/core-testnet/core_testnet_002/{0} -o {1}"

[[VALIDATORS]]
NAME="sdf_testnet_3"
HOME_DOMAIN="testnet.stellar.org"
PUBLIC_KEY="GC2V2EFSXN6SQTWVYA5EPJPBWWIMSD2XQNKUOHGEKB535AQE2I6IXV2Z"
ADDRESS="core-testnet3.stellar.org"
HISTORY="curl -sf http://history.stellar.org/prd/core-testnet/core_testnet_003/{0} -o {1}"

MonsieurNicolas commented 2 years ago

There is probably a problem with those config files; somebody with more Docker experience should chime in.

It could also be a problem with the host.

You should probably add some diagnostic logging to entrypoint.sh. It looks like a problem with this volume mount:

    volumes:
      - /home/user/docker-vols/stellar/core/:/stellar/buckets/

I would try replacing the contents of entrypoint.sh with something like

#!/bin/bash
ls -al /stellar/buckets/
mkdir /stellar/buckets/testfolder
touch /stellar/buckets/testfolder/somefile
rm -rf /stellar/buckets/testfolder
ls -al /stellar/buckets/
exec "$@"

piccadil commented 2 years ago

@MonsieurNicolas thanks for the answer. I have many other services and blockchain nodes running with Docker volumes. Here is the new entrypoint.sh that you suggested, followed by its log output:

#!/bin/bash
ls -al /stellar/buckets/
mkdir /stellar/buckets/testfolder
touch /stellar/buckets/testfolder/somefile
rm -rf /stellar/buckets/testfolder
ls -al /stellar/buckets/
exec "$@"

The logs:

stellar            | total 8
stellar            | drwxrwxrwx 2 stellar stellar 4096 Jun 27 10:58 .
stellar            | drwxr-xr-x 1 root    root    4096 Jul  4 06:10 ..
stellar            | total 8
stellar            | drwxrwxrwx 2 stellar stellar 4096 Jul  4 06:10 .
stellar            | drwxr-xr-x 1 root    root    4096 Jul  4 06:10 ..
postgresql-core_1  | 2022-07-04 06:10:25.991 UTC [19] LOG:  database system was shut down at 2022-06-27 11:02:02 UTC
postgresql-core_1  | 2022-07-04 06:10:26.000 UTC [1] LOG:  database system is ready to accept connections
stellar            | 2022-07-04T06:10:36.673 [default INFO] Config from /stellar/stellar-core.cfg
stellar            | 2022-07-04T06:10:36.677 [default INFO] Generated QUORUM_SET: {
stellar            |    "t" : 2,
stellar            |    "v" : [ "sdf_testnet_2", "sdf_testnet_3", "sdf_testnet_1" ]
stellar            | }
stellar            | 
stellar            | 2022-07-04T06:10:36.685 GAHTJ [default INFO] Application destructing
stellar            | 2022-07-04T06:10:36.686 GAHTJ [default INFO] Application destroyed
stellar            | 2022-07-04T06:10:36.687 GAHTJ [default FATAL] Got an exception: filesystem error: in remove_all: Device or resource busy [/stellar/buckets/]
stellar            | 2022-07-04T06:10:36.687 GAHTJ [default FATAL] Please report this bug along with this log file if this was not expected
stellar exited with code 1
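
The two directory listings above suggest that creating and deleting a subfolder inside /stellar/buckets/ works, while the FATAL line still reports remove_all failing on /stellar/buckets/ itself. On Linux, a directory that is an active mount point generally cannot be removed and the attempt fails with "Device or resource busy", so one thing worth checking (it is not confirmed anywhere in this thread) is whether the failure comes from removing the mount point rather than its contents. Below is a minimal sketch of such a check. The function names is_mount_point and check_bucket_dir are hypothetical helpers, not part of stellar-core or the official image, and the sketch assumes a Linux container with /proc mounted and a path without spaces.

```shell
#!/bin/bash
# Hypothetical helper: succeed if DIR is listed as a mount point.
# Usage: is_mount_point DIR [MOUNTINFO_FILE]
# MOUNTINFO_FILE defaults to /proc/self/mountinfo, whose fifth
# whitespace-separated field is the mount point path.
is_mount_point() {
  local dir="${1%/}"                        # drop one trailing slash
  local mountinfo="${2:-/proc/self/mountinfo}"
  [ -n "$dir" ] || dir="/"                  # "/" was stripped to "" above
  awk -v p="$dir" '$5 == p { found = 1 } END { exit !found }' "$mountinfo"
}

# Hypothetical helper: print whether DIR is a mount point and, if it is,
# show what happens when the directory itself is removed.
check_bucket_dir() {
  local dir="$1"
  if is_mount_point "$dir"; then
    echo "$dir is a mount point; trying to remove the directory itself:"
    # Assumption: the kernel refuses rmdir on an active mount point, so
    # this is expected to fail without deleting anything. The error
    # message it prints is the interesting part.
    rmdir "$dir" || true
  else
    echo "$dir is not a mount point"
  fi
}
```

To use it, the two functions could be pasted into entrypoint.sh and called as check_bucket_dir /stellar/buckets/ just before the exec "$@" line. If rmdir prints the same "Device or resource busy" message, that would point at the mount point itself rather than at file permissions on the host directory.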

PierreEPS commented 11 months ago

Is there any update on the "Device or resource busy" error when BUCKET_DIR_PATH is mounted as a volume in a Docker container? We are also seeing it on this version: stellar-core 20.0.2 (669916b56106a72a7f79ab8b4a7898e77b28b49e)