vincentsarago / MAXAR_opendata_to_pgstac

Create STAC Collections/Items for some AWS OpenData
MIT License

Loading sample items causes database connection to close #6

Open tylere opened 4 months ago

tylere commented 4 months ago

Loading collection.json works:

(eoapi-test) tylere@tylers-mbp eoapi-test % pypgstac load collections ~/Documents/GitHub/vincentsarago/MAXAR_opendata_to_pgstac/Maxar/collections.json --dsn postgresql://username:password@0.0.0.0:5439/postgis --method insert_ignore 

But loading items.json fails:

(eoapi-test) tylere@tylers-mbp eoapi-test % pypgstac load items ~/Documents/GitHub/vincentsarago/MAXAR_opendata_to_pgstac/Maxar/items.json --dsn postgresql://username:password@0.0.0.0:5439/postgis --method insert_ignore
Traceback (most recent call last):
  File "/Users/tylere/Documents/GitHub/tylere/eoapi-test/.pixi/envs/default/bin/pypgstac", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/Users/tylere/Documents/GitHub/tylere/eoapi-test/.pixi/envs/default/lib/python3.11/site-packages/pypgstac/pypgstac.py", line 125, in cli
    fire.Fire(PgstacCLI)
  File "/Users/tylere/Documents/GitHub/tylere/eoapi-test/.pixi/envs/default/lib/python3.11/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/tylere/Documents/GitHub/tylere/eoapi-test/.pixi/envs/default/lib/python3.11/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
                                ^^^^^^^^^^^^^^^^^^^^
  File "/Users/tylere/Documents/GitHub/tylere/eoapi-test/.pixi/envs/default/lib/python3.11/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/tylere/Documents/GitHub/tylere/eoapi-test/.pixi/envs/default/lib/python3.11/site-packages/pypgstac/pypgstac.py", line 76, in load
    loader.load_items(file, method, dehydrated, chunksize)
  File "/Users/tylere/Documents/GitHub/tylere/eoapi-test/.pixi/envs/default/lib/python3.11/site-packages/pypgstac/load.py", line 613, in load_items
    self.load_partition(self._partition_cache[k], g, insert_mode)
  File "/Users/tylere/Documents/GitHub/tylere/eoapi-test/.pixi/envs/default/lib/python3.11/site-packages/tenacity/__init__.py", line 326, in wrapped_f
    return self(f, *args, **kw)
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/tylere/Documents/GitHub/tylere/eoapi-test/.pixi/envs/default/lib/python3.11/site-packages/tenacity/__init__.py", line 406, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/tylere/Documents/GitHub/tylere/eoapi-test/.pixi/envs/default/lib/python3.11/site-packages/tenacity/__init__.py", line 351, in iter
    return fut.result()
           ^^^^^^^^^^^^
  File "/Users/tylere/Documents/GitHub/tylere/eoapi-test/.pixi/envs/default/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/tylere/Documents/GitHub/tylere/eoapi-test/.pixi/envs/default/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/Users/tylere/Documents/GitHub/tylere/eoapi-test/.pixi/envs/default/lib/python3.11/site-packages/tenacity/__init__.py", line 409, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/Users/tylere/Documents/GitHub/tylere/eoapi-test/.pixi/envs/default/lib/python3.11/site-packages/pypgstac/load.py", line 282, in load_partition
    cur.execute(
  File "/Users/tylere/Documents/GitHub/tylere/eoapi-test/.pixi/envs/default/lib/python3.11/site-packages/psycopg/cursor.py", line 732, in execute
    raise ex.with_traceback(None)
psycopg.OperationalError: consuming input failed: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
discarding closed connection: <psycopg.Connection [BAD] at 0x10611a110>
docker-compose.yml

```
version: '3'

services:

  # change to official image when available https://github.com/radiantearth/stac-browser/pull/386
  stac-browser:
    build:
      context: dockerfiles
      dockerfile: Dockerfile.browser
    ports:
      - "${MY_DOCKER_IP:-127.0.0.1}:8085:8085"
    depends_on:
      - stac
      - raster
      - database

  stac:
    # Note:
    # the official ghcr.io/stac-utils/stac-fastapi-pgstac image uses python 3.8 and uvicorn
    # which is why here we use a custom Dockerfile using python 3.11 and gunicorn
    build:
      context: .
      dockerfile: dockerfiles/Dockerfile.stac
    ports:
      - "${MY_DOCKER_IP:-127.0.0.1}:8081:8081"
    environment:
      # Application
      - HOST=0.0.0.0
      - PORT=8081
      - MODULE_NAME=stac_fastapi.pgstac.app
      - VARIABLE_NAME=app
      # gunicorn
      # https://github.com/tiangolo/uvicorn-gunicorn-docker#web_concurrency
      - WEB_CONCURRENCY=10
      # https://github.com/tiangolo/uvicorn-gunicorn-docker#workers_per_core
      # - WORKERS_PER_CORE=1
      # https://github.com/tiangolo/uvicorn-gunicorn-docker#max_workers
      # - MAX_WORKERS=10
      # Postgres connection
      - POSTGRES_USER=username
      - POSTGRES_PASS=password
      - POSTGRES_DBNAME=postgis
      - POSTGRES_HOST_READER=database
      - POSTGRES_HOST_WRITER=database
      - POSTGRES_PORT=5432
      - DB_MIN_CONN_SIZE=1
      - DB_MAX_CONN_SIZE=10
    depends_on:
      - database
    command: bash -c "bash /tmp/scripts/wait-for-it.sh -t 120 -h database -p 5432 && /start.sh"
    volumes:
      - ./dockerfiles/scripts:/tmp/scripts

  raster:
    # At the time of writing, rasterio and psycopg wheels are not available for arm64 arch
    # so we force the image to be built with linux/amd64
    platform: linux/amd64
    image: ghcr.io/stac-utils/titiler-pgstac:0.5.1
    ports:
      - "${MY_DOCKER_IP:-127.0.0.1}:8082:8082"
    environment:
      # Application
      - HOST=0.0.0.0
      - PORT=8082
      # https://github.com/tiangolo/uvicorn-gunicorn-docker#web_concurrency
      - WEB_CONCURRENCY=1
      # https://github.com/tiangolo/uvicorn-gunicorn-docker#workers_per_core
      - WORKERS_PER_CORE=1
      # https://github.com/tiangolo/uvicorn-gunicorn-docker#max_workers
      - MAX_WORKERS=10
      # Postgres connection
      - POSTGRES_USER=username
      - POSTGRES_PASS=password
      - POSTGRES_DBNAME=postgis
      - POSTGRES_HOST=database
      - POSTGRES_PORT=5432
      - DB_MIN_CONN_SIZE=1
      - DB_MAX_CONN_SIZE=10
      # - DB_MAX_QUERIES=10
      # - DB_MAX_IDLE=10
      # GDAL Config
      - CPL_TMPDIR=/tmp
      - GDAL_CACHEMAX=75%
      - GDAL_INGESTED_BYTES_AT_OPEN=32768
      - GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR
      - GDAL_HTTP_MERGE_CONSECUTIVE_RANGES=YES
      - GDAL_HTTP_MULTIPLEX=YES
      - GDAL_HTTP_VERSION=2
      - VSI_CACHE=TRUE
      - VSI_CACHE_SIZE=536870912
      # TiTiler Config
      - MOSAIC_CONCURRENCY=1
      # AWS S3 endpoint config
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
    depends_on:
      - database
    command: bash -c "bash /tmp/scripts/wait-for-it.sh -t 120 -h database -p 5432 && /start.sh"
    volumes:
      - ./dockerfiles/scripts:/tmp/scripts

  vector:
    image: ghcr.io/developmentseed/tipg:0.3.1
    ports:
      - "${MY_DOCKER_IP:-127.0.0.1}:8083:8083"
    environment:
      # Application
      - HOST=0.0.0.0
      - PORT=8083
      # https://github.com/tiangolo/uvicorn-gunicorn-docker#web_concurrency
      - WEB_CONCURRENCY=10
      # https://github.com/tiangolo/uvicorn-gunicorn-docker#workers_per_core
      # - WORKERS_PER_CORE=1
      # https://github.com/tiangolo/uvicorn-gunicorn-docker#max_workers
      # - MAX_WORKERS=10
      # Postgres connection
      - POSTGRES_USER=username
      - POSTGRES_PASS=password
      - POSTGRES_DBNAME=postgis
      - POSTGRES_HOST=database
      - POSTGRES_PORT=5432
      - DB_MIN_CONN_SIZE=1
      - DB_MAX_CONN_SIZE=10
    command: bash -c "bash /tmp/scripts/wait-for-it.sh -t 120 -h database -p 5432 && /start.sh"
    depends_on:
      - database
    volumes:
      - ./dockerfiles/scripts:/tmp/scripts

  database:
    image: ghcr.io/stac-utils/pgstac:v0.8.4
    environment:
      - POSTGRES_USER=username
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=postgis
      - PGUSER=username
      - PGPASSWORD=password
      - PGDATABASE=postgis
    ports:
      - "${MY_DOCKER_IP:-127.0.0.1}:5439:5432"
    command: postgres -N 500
    # volumes:
    #   - ./.pgdata:/var/lib/postgresql/data

networks:
  default:
    name: eoapi-network
```

This docker-compose.yml file is from eoapi-template, but with the database image upgraded to ghcr.io/stac-utils/pgstac:v0.8.4, because v0.7.10 produced the following error:

Exception: pypgstac version 0.8.4 is not compatible with the target database version 0.7.10. database version 0.7.10.

Note that the loading issue occurs even when only attempting to load the first row of items.json.
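For anyone trying to reproduce the single-row case, here is a minimal sketch of how the first row could be split out into its own file. It assumes items.json is newline-delimited JSON (one STAC item per line), which the thread implies but does not state; `write_first_items` and the file names are hypothetical helpers, not part of pypgstac.

```python
import json
from pathlib import Path


def write_first_items(src: Path, dst: Path, count: int = 1) -> int:
    """Copy the first `count` non-empty lines of an NDJSON file to `dst`.

    Each line is parsed with json.loads before it is copied, so a
    malformed row fails here rather than inside the loader.
    Returns the number of rows written.
    """
    written = 0
    with src.open(encoding="utf-8") as fin, dst.open("w", encoding="utf-8") as fout:
        for line in fin:
            if written >= count:
                break
            line = line.strip()
            if not line:
                continue  # skip blank lines
            json.loads(line)  # raises json.JSONDecodeError on invalid JSON
            fout.write(line + "\n")
            written += 1
    return written


if __name__ == "__main__":
    # Hypothetical file names; adjust to the local checkout.
    n = write_first_items(Path("items.json"), Path("first_item.json"))
    print(f"wrote {n} item(s)")
```

The resulting file can then be passed to the same `pypgstac load items` command shown above in place of items.json.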

vincentsarago commented 4 months ago

@tylere I should update the documentation, but did you unzip the items.json.zip first? The error message is not really clear about this.

tylere commented 4 months ago

Yes, I did unzip the zip file before ingesting. At first I tried to ingest the zip file directly (which would be a nice feature), but I think the error I received was pretty clear.
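Since the loader is given the unzipped file, the extraction step could be scripted along these lines. This is only a sketch using the standard-library `zipfile` module; `extract_items` and the paths are hypothetical, and nothing here is a pypgstac feature.

```python
import zipfile
from pathlib import Path


def extract_items(archive: Path, out_dir: Path) -> list[Path]:
    """Extract every member of `archive` into `out_dir`.

    Returns the paths of the extracted files (directory entries are skipped).
    """
    if not zipfile.is_zipfile(archive):
        raise ValueError(f"{archive} is not a zip archive")
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(out_dir)
        return [out_dir / name for name in zf.namelist() if not name.endswith("/")]


if __name__ == "__main__":
    # Hypothetical paths; adjust to the local checkout.
    for path in extract_items(Path("Maxar/items.json.zip"), Path("Maxar")):
        print(path)
```

The printed path is what would then be handed to `pypgstac load items`.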

vincentsarago commented 4 months ago

Well I have no idea what's going on then. It seems the database is 💥 but I can't know why 🤷
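One cheap first check after the failed load is whether the server still accepts TCP connections on the published port at all. The sketch below uses only the standard library and the DSN format from the commands above; the helper names are hypothetical. A successful connect only shows that something is listening on that port. It says nothing about why the backend handling the load went away, so the database container's logs remain the place to look for the actual cause.

```python
import socket
from urllib.parse import urlsplit


def dsn_host_port(dsn: str, default_port: int = 5432) -> tuple[str, int]:
    """Pull host and port out of a URL-style DSN (postgresql://user:pass@host:port/db)."""
    parts = urlsplit(dsn)
    if parts.hostname is None:
        raise ValueError(f"no host found in DSN: {dsn!r}")
    return parts.hostname, parts.port or default_port


def server_accepts_connections(dsn: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the DSN's host and port can be opened."""
    host, port = dsn_host_port(dsn)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    dsn = "postgresql://username:password@0.0.0.0:5439/postgis"
    print("accepting connections:", server_accepts_connections(dsn))
```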