GoogleCloudPlatform / gcsfuse

A user-space file system for interacting with Google Cloud Storage
https://cloud.google.com/storage/docs/gcs-fuse
Apache License 2.0

Running gcsfuse in a docker container (with google application credentials) mount hangs forever unless using --foreground #711

Closed sereeena closed 2 years ago

sereeena commented 2 years ago

I have installed gcsfuse in a docker container (based on bullseye and using gcsfuse-buster repo), and I am running the docker container with --cap-add SYS_ADMIN --device /dev/fuse --security-opt apparmor:unconfined to allow fuse to run.

I have set up a service account on GCP with cloud storage permissions, have downloaded a json key for it and use this with GOOGLE_APPLICATION_CREDENTIALS when calling gcsfuse.

Running interactively in the docker container, I managed to mount a bucket with gcsfuse the first time, but whenever I have tried to mount since, it hangs here:

GOOGLE_APPLICATION_CREDENTIALS=ci-account.json gcsfuse --debug_fuse --debug_fs --debug_gcs --debug_http prs-reference-files /share/referencefiles
2022/07/07 08:08:37.198464 Start gcsfuse/0.41.4 (Go version go1.17.6) for app "" using mount point: /share/referencefiles
2022/07/07 08:08:37.215536 Opening GCS connection...   

However, if I run with --foreground it reports that the mount was successful:

GOOGLE_APPLICATION_CREDENTIALS=ci-account.json gcsfuse --foreground prs-reference-files /share/referencefiles &
[1] 116
root@d3d514779697:/tests# 2022/07/07 08:05:49.549341 Start gcsfuse/0.41.4 (Go version go1.17.6) for app "" using mount point: /share/referencefiles
2022/07/07 08:05:49.549403 Opening GCS connection...
2022/07/07 08:05:49.549786 Creating a mount at "/share/referencefiles"

WARNING: gcsfuse invoked as root. This will cause all files to be owned by
root. If this is not what you intended, invoke gcsfuse as the user that will
be interacting with the file system.
2022/07/07 08:05:49.550168 Creating a new server...
2022/07/07 08:05:49.550180 Set up root directory for bucket prs-reference-files
2022/07/07 08:05:49.550185 OpenBucket("prs-reference-files", "")
2022/07/07 08:05:51.243185 Mounting file system "prs-reference-files"...
2022/07/07 08:05:51.244811 File system has been successfully mounted.

I suppose I can run it in this way, but it seems unusual to have to run it in the foreground and then immediately background it in order for it to work?

avidullu commented 2 years ago

Sorry for the late response. Some thoughts:

a. Did you fusermount -u /share/referencefiles before retrying to mount the bucket?
b. If you're not going to use the --log-file and --log-format, is there any reason to pass the debug_* flags?
c. Could you share more about the running environment? The OS version being used etc.? So we can try to repro this.
d. Can you verify if you're not hitting this issue and maybe this can help?

I am guessing that, since the non-foreground version runs a background process, there is some Docker-based constraint that results in the command hanging.

markkimsal commented 2 years ago

Same issue but not in a docker container.

gcsfuse --debug_gcs --debug_http --debug_mutex --debug_fs --debug_fuse -o nonempty --implicit-dirs ftp-bucket /mnt/ftp-bucket

Hangs forever with the output

2022/07/15 21:00:56.970034 Start gcsfuse/0.41.4 (Go version go1.17.6) for app "" using mount point: /mnt/ftp-bucket
2022/07/15 21:00:56.989656 Opening GCS connection...

Adding the --foreground flag showed an error that I was not the owner of /mnt/ftp-bucket

sudo chown me /mnt/ftp-bucket

now it works, but still only with --foreground

This is on GCE host with Ubuntu 20.04.4 LTS

$ gcsfuse --v
gcsfuse version 0.41.4 (Go version go1.17.6)

$ cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.4 LTS"

avidullu commented 2 years ago

@markkimsal Based on this you could use the nohup command. Docker seems to have some restrictions on background processes which need some getting around.

sereeena commented 2 years ago

I tried to use the nohup command, but this doesn't work either. Sorry for the late responses; I had kind of given up on it. As I am not sure whether I understand your suggestions or whether they fix things, I thought I'd mention how to reproduce the problem and what I have tried so far.

a. Did you fusermount -u /share/referencefiles before retrying to mount the bucket?

No, as I don't think it mounted successfully, so there was nothing to unmount. However, calling the fusermount command results in fusermount: failed to clone namespace: Operation not permitted

b. If you're not going to use the --log-file and --log-format, is there any reason to pass the debug_* flags?

No reason then

c. Could you share more about the running environment? The OS version being used etc.?

Here is a minimal Dockerfile:

FROM python:3.8-slim-bullseye

# install gcsfuse (using the buster repository, since a bullseye one doesn't exist)
RUN apt-get update && apt-get -y upgrade \
    && apt-get install -y --no-install-recommends \
    curl \
    gnupg \
    && export GCSFUSE_REPO=gcsfuse-buster \
    && echo "deb http://packages.cloud.google.com/apt $GCSFUSE_REPO main" | tee /etc/apt/sources.list.d/gcsfuse.list \
    && curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add - \
    && apt-get update && apt-get install -y gcsfuse \
    && rm -rf /var/lib/apt/lists/*

RUN mkdir "tests"
WORKDIR tests
COPY *.json ./

Then, from my Linux Mint machine, I build it and run it interactively:

docker run -it --rm ci-runner /bin/bash

These commands hang forever (at Opening GCS connection):

root@3eb0721b3350:/tests# mkdir -p /share/prs-reference-files
root@3eb0721b3350:/tests# export GOOGLE_APPLICATION_CREDENTIALS=ci-account.json
root@3eb0721b3350:/tests# gcsfuse prs-reference-files /share/prs-reference-files
2022/07/18 04:55:07.119270 Start gcsfuse/0.41.4 (Go version go1.17.6) for app "" using mount point: /share/prs-reference-files
2022/07/18 04:55:07.136060 Opening GCS connection...

Trying to add nohup just causes it to hang (at nohup: ignoring input)

root@3eb0721b3350:/tests# nohup gcsfuse prs-reference-files /share/prs-reference-files
nohup: ignoring input and appending output to 'nohup.out'

I realise this is not how you would run gcsfuse in a docker container in practice (you would have a script as CMD or ENTRYPOINT), but for the purposes of reproducing the problem I think this should be fine? Sorry if I have not understood how to use nohup, but I couldn't get that to work.

avidullu commented 2 years ago

Thanks for the details, @sereeena. I meant using nohup along with the --foreground flag.

Also, did you happen to look at the Dockerfile in the repo? https://github.com/GoogleCloudPlatform/gcsfuse/blob/master/Dockerfile

sereeena commented 2 years ago

Thanks, I was already able to get gcsfuse to run with --foreground (see title of issue), so nohup is not necessary.

My intention had been to create a docker container which mounted a bucket with gcsfuse, then did other things like run an app or run unit tests, so there would be a script/entrypoint where it would call gcsfuse then continue with other things.

In order to do this, it seems I need to call gcsfuse with --foreground then background it? So it could then run further commands or an app with access to the bucket? If that is not a bug then that's fine, I just thought it was strange, or in any case it was not obvious that --foreground was necessary. Thanks
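The workaround described above can be sketched as a small entrypoint helper. This is only a minimal sketch and not something taken from the gcsfuse documentation: the function names wait_for_mount and mount_then_run, the CHECK_CMD override, and the 30-second timeout are all hypothetical, and it assumes the mountpoint utility (part of util-linux) is available in the image.

```shell
#!/bin/sh
# Hypothetical entrypoint helpers (names are illustrative, not part of gcsfuse).

# wait_for_mount DIR [TIMEOUT]
# Poll once per second until DIR is a mount point, or give up after TIMEOUT
# seconds (default 30). CHECK_CMD can be overridden so the helper can be
# exercised on a machine without FUSE.
wait_for_mount() {
    dir=$1
    timeout=${2:-30}
    check=${CHECK_CMD:-mountpoint -q}
    waited=0
    until $check "$dir"; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for $dir to be mounted" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# mount_then_run BUCKET DIR COMMAND [ARGS...]
# Start gcsfuse with --foreground, background it from the shell, wait for the
# mount to appear, then run the given command with access to the bucket.
mount_then_run() {
    bucket=$1
    dir=$2
    shift 2
    mkdir -p "$dir"
    gcsfuse --foreground "$bucket" "$dir" &
    wait_for_mount "$dir" 30 || return 1
    "$@"
}
```

An entrypoint script could then end with something like mount_then_run prs-reference-files /share/referencefiles python -m pytest, where the trailing command is only an example. Note that this sketch does not detect gcsfuse exiting early; in that case it simply waits until the timeout.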

markkimsal commented 2 years ago

@markkimsal Based on this you could use the nohup command. Docker seems to have some restrictions on background processes which need some getting around.

@avidullu I'm experiencing the same issue just on a bare VM, no docker involved (GCE, ubuntu 20.04 LTS)

The output of the command when it hangs is different from that in other tickets; it matches the output in this ticket.

No matter how many --debug flags I use, it hangs right after

2022/07/07 08:08:37.215536 Opening GCS connection...

avidullu commented 2 years ago

@markkimsal Is this issue seen on a fresh VM with just gcsfuse as the additional software installed? Are you the owner of the bucket?

markkimsal commented 2 years ago

fresh GCE, ubuntu 20.04 focal

export GCSFUSE_REPO=gcsfuse-`lsb_release -c -s`
echo "deb http://packages.cloud.google.com/apt $GCSFUSE_REPO main" | sudo tee /etc/apt/sources.list.d/gcsfuse.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
sudo apt-get update
sudo apt-get install -y gcsfuse
sudo mkdir /mnt/gcsfuse
sudo chown mark /mnt/gcsfuse

GOOGLE_APPLICATION_CREDENTIALS=service-account.json gcsfuse --debug_fuse --debug_fs --debug_gcs --debug_http bucket-test /mnt/gcsfuse/
2022/07/20 13:46:23.910864 Start gcsfuse/0.41.4 (Go version go1.17.6) for app "" using mount point: /mnt/gcsfuse
2022/07/20 13:46:23.922902 Opening GCS connection...

I don't know how to tell if this service account is the bucket owner. I don't think it is, but this same service account file can read and write to the bucket.

GOOGLE_APPLICATION_CREDENTIALS=service-account.json gcsfuse --debug_fuse --debug_fs --debug_gcs --debug_http --foreground bucket-test /mnt/gcsfuse/

// ... tons of output ending with
http: 2022/07/20 13:48:09.242773 ====================
gcs: 2022/07/20 13:48:09.242922 Req              0x1: -> ListObjects("") (50.008904ms): OK
2022/07/20 13:48:09.242958 Mounting file system "bucket-test"...
fuse_debug: 2022/07/20 13:48:09.257287 Op 0x00000002        connection.go:416] <- init
fuse_debug: 2022/07/20 13:48:09.257328 Op 0x00000002        connection.go:498] -> OK ()
2022/07/20 13:48:09.257360 File system has been successfully mounted.

avidullu commented 2 years ago

Could you please retry, giving the full path to the credentials file, e.g. /home/$USER/service.json?
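One way to try this without hard-coding a home directory is to resolve the key file to an absolute path before invoking gcsfuse. A minimal sketch: abs_path is a hypothetical helper name, and whether a relative path is what breaks the background mount is only an assumption at this point.

```shell
#!/bin/sh
# abs_path FILE: print FILE as an absolute path (hypothetical helper).
# The directory part must exist; the file itself is not checked.
abs_path() {
    dir=$(cd "$(dirname "$1")" && pwd -P) || return 1
    # Avoid a doubled slash when the file sits directly under /.
    [ "$dir" = "/" ] && dir=""
    printf '%s/%s\n' "$dir" "$(basename "$1")"
}

# Example use with the names from the comment above:
# GOOGLE_APPLICATION_CREDENTIALS=$(abs_path service-account.json) \
#     gcsfuse bucket-test /mnt/gcsfuse/
```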

markkimsal commented 2 years ago

@avidullu

with full path and no foreground

2022/08/08 12:20:58.222103 Start gcsfuse/0.41.4 (Go version go1.17.6) for app "" using mount point: /mnt/gcsfuse 
2022/08/08 12:20:58.242950 Opening GCS connection...
2022/08/08 12:20:58.378301 Mounting file system "my-bucket"...
daemonize.Run: readFromProcess: sub-process: mountWithArgs: mountWithConn: Mount: mount: running /usr/bin/fusermount: exit status 1

with relative path and no foreground (either with ./ or without ./)

2022/08/08 12:22:17.732762 Opening GCS connection...
[hangs]

avidullu commented 2 years ago

@markkimsal Can you file a separate issue with the bug template, since this one was about a Docker container and you aren't using one? Please provide the entire command line and its output. I'm closing this issue since there are no actionable items for the original bug.

@sereeena Sorry the bug got sidetracked. To your point about needing --foreground when running gcsfuse in a Docker container: that is correct afaik, based on the little research I did into background processes running in Docker containers. This is not a gcsfuse issue, though it is something Docker may be able to help you with.