oittaa / gcp-storage-emulator

Local emulator for Google Cloud Storage
BSD 3-Clause "New" or "Revised" License

Downloading blobs not working #113

Open JeffryCA opened 2 years ago

JeffryCA commented 2 years ago

I set up the emulator with the Docker image oittaa/gcp-storage-emulator, run through docker-compose:

...
storage:
    image: oittaa/gcp-storage-emulator
    volumes:
      - ./.cloudstorage:/storage
    ports:
      - 9023:9023
    environment:
      - PORT=9023

and tests:

import pytest
from google.api_core.exceptions import Conflict
from google.auth.credentials import AnonymousCredentials
from google.cloud import storage


@pytest.fixture(scope="session", autouse=True)
def setup_local_storage_emulator():
    client = storage.Client(
        credentials=AnonymousCredentials(),
        project="test",
    )
    # fill buckets with test data
    try:
        bucket = client.create_bucket(TEST_BUCKET)
    except Conflict:
        # the bucket already exists from an earlier run, so reuse it
        bucket = client.get_bucket(TEST_BUCKET)

    blob = bucket.blob("blob1")
    blob.upload_from_string("test1")
    blob = bucket.blob("blob2")
    blob.upload_from_string("test2")
    # up to here everything runs
    for blob in bucket.list_blobs():
        # this step fails
        content = blob.download_as_bytes()
        print("Blob [{}]: {}".format(blob.name, content))

The error I get after a while is the following:

requests.exceptions.ConnectionError: HTTPConnectionPool(host='0.0.0.0', port=9023): Max retries exceeded with url: /download/storage/v1/b/TEST_BUCKET/o/blob1?generation=1640043791&alt=media (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb2434a7460>: Failed to establish a new connection: [Errno 111] Connection refused'))

Is there something

oittaa commented 2 years ago

I don't use Docker Compose, so I don't really know what's going on. If you provide a minimal test case, I could try to reproduce it at some point.

FAR5HID commented 2 years ago

Ran into the same issue. Did you find anything later? @JeffryCA

Gradecak commented 1 year ago

I ran into the same issue today, and after some investigation I believe I've discovered the cause.

The download process first resolves the blob to an object resource and then uses the mediaLink property to determine where to download the resource from.

Unfortunately, the mediaLink is constructed from the base_url of the server, which is usually 0.0.0.0 when running the emulator inside docker-compose. Since containers in docker-compose are generally addressed via DNS, it's likely that you have STORAGE_EMULATOR_HOST set to the name of the service where the emulator is running.

This in turn means that when you try to download the blob from a container other than the one running the emulator, the download request goes to 0.0.0.0 rather than to the hostname of the emulator's container.
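One way to confirm this diagnosis is to compare the host in a blob's mediaLink with the host the client was configured to use. The helper below is only a sketch: `media_link_mismatch` is a hypothetical name, and the example values assume STORAGE_EMULATOR_HOST points at the `storage` service from the compose file above.

```python
from urllib.parse import urlsplit


def media_link_mismatch(media_link, emulator_host):
    """Return True when media_link targets a different host:port than emulator_host.

    Hypothetical helper for illustration; it is not part of the emulator
    or of google-cloud-storage.
    """
    link = urlsplit(media_link)
    # The emulator host may be written with or without a scheme.
    if "://" not in emulator_host:
        emulator_host = "http://" + emulator_host
    target = urlsplit(emulator_host)
    return (link.hostname, link.port) != (target.hostname, target.port)


# URL taken from the error message in the original report.
failing_link = (
    "http://0.0.0.0:9023/download/storage/v1/b/TEST_BUCKET/o/blob1"
    "?generation=1640043791&alt=media"
)
# Assumed client setting: the compose service name and port.
print(media_link_mismatch(failing_link, "http://storage:9023"))
```

If this reports a mismatch for the link returned by the emulator, the download request is leaving the client for an address that is not reachable from its container.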

I think a relatively straightforward solution to this might be to allow for a STORAGE_HOST environment variable that defaults to DEFAULT_HOST if not provided. The mediaLink can then be constructed from STORAGE_HOST allowing it to work inside of the docker network.
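A rough sketch of what that could look like, under stated assumptions: `build_media_link` is a hypothetical function rather than the emulator's actual code, the value of DEFAULT_HOST is assumed to be 0.0.0.0, and the URL shape is copied from the error message in the original report.

```python
import os
from urllib.parse import quote, urlencode

DEFAULT_HOST = "0.0.0.0"  # assumed value; the emulator defines its own default
DEFAULT_PORT = 9023       # port used in the compose file above


def build_media_link(bucket, name, generation, environ=None):
    """Build a download URL whose host comes from STORAGE_HOST when it is set."""
    env = os.environ if environ is None else environ
    # Fall back to the defaults when the variables are not provided.
    host = env.get("STORAGE_HOST", DEFAULT_HOST)
    port = env.get("PORT", DEFAULT_PORT)
    query = urlencode({"generation": generation, "alt": "media"})
    return "http://{}:{}/download/storage/v1/b/{}/o/{}?{}".format(
        host, port, quote(bucket, safe=""), quote(name, safe=""), query
    )


# With STORAGE_HOST set to the compose service name, the link becomes
# resolvable from other containers on the same Docker network.
print(build_media_link("TEST_BUCKET", "blob1", 1640043791,
                       environ={"STORAGE_HOST": "storage", "PORT": "9023"}))
```

Passing `environ` explicitly keeps the sketch easy to test; a real change would read the process environment once at server start-up.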

I'd be happy to set this up if PRs are welcome on the project.

gabrieljoelc commented 8 months ago

I was able to get things working with a combination of https://github.com/oittaa/gcp-storage-emulator/issues/82#issue-990990857 and https://github.com/oittaa/gcp-storage-emulator/issues/82#issuecomment-915135545. It looks like this works around the issue @Gradecak mentioned?