kuwala-io / kuwala

Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, and Great Expectations, together in one intuitive interface built with React Flow. In addition, we provide third-party data for data science models and products, with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) high-resolution demographics data, b) points of interest from OpenStreetMap, c) Google Popular Times.
https://kuwala.io
Apache License 2.0

OSM-POI: processing step fails #146

Open geoHeil opened 2 years ago

geoHeil commented 2 years ago

To reproduce the problem:

git clone https://github.com/kuwala-io/kuwala.git
cd kuwala
cd pipelines

docker-compose run osm-poi
# download, EU, austria
# download, EU, lithuania
docker-compose run --rm osm-parquetizer java -jar target/osm-parquetizer-1.0.1-SNAPSHOT.jar --continent=eu --country=aut
docker-compose run --rm osm-parquetizer java -jar target/osm-parquetizer-1.0.1-SNAPSHOT.jar --continent=eu --country=ltu

Note: for Austria (AT) I ran this manually outside of Docker because of memory issues in Docker (even though more than enough memory is available to the Docker daemon).

docker-compose run osm-poi --continent=eu --country=ltu  # <-- stuck here
docker-compose run osm-poi --continent=eu --country=aut  # <-- fails with checksum error

# process, eu, aut
# process, eu, ltu

docker-compose run admin-boundaries --continent=eu --country=aut
docker-compose run admin-boundaries --continent=eu --country=ltu

docker-compose run google-trends --continent=europe --country=austria --keyword=Altlerchenfeld
docker-compose run google-trends --continent=europe --country=austria --keyword=Gänserndorf
docker-compose run google-trends --continent=europe --country=austria --keyword=Breitenfeld
docker-compose run google-trends --continent=europe --country=austria --keyword=Josefstadt
docker-compose run google-trends --continent=europe --country=austria --keyword="Innere Stadt"
docker-compose run google-trends --continent=europe --country=austria --keyword="St. Pölten"
docker-compose run google-trends --continent=europe --country=austria --keyword=Josefstadt
docker-compose run google-trends --continent=europe --country=austria --keyword=Aspern
docker-compose run google-trends --continent=europe --country=austria --keyword=Essling
docker-compose run google-trends --continent=europe --country=austria --keyword=HADERSDORF
docker-compose run google-trends --continent=europe --country=austria --keyword=Hütteldorf
docker-compose run google-trends --continent=europe --country=austria --keyword=Landstraße
docker-compose run google-trends --continent=europe --country=austria --keyword=Leopoldstadt
docker-compose run google-trends --continent=europe --country=austria --keyword=Aspern
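
The repeated google-trends calls above can also be written as a loop. This is only a minimal sketch, assuming the same service name and flags as in the commands above. `run_trends` and `DRY_RUN` are hypothetical names introduced for illustration, and the keywords that appear twice above (Josefstadt, Aspern) are listed only once here.

```shell
# Sketch: run the google-trends step once per keyword.
# DRY_RUN and run_trends are hypothetical helpers, not part of Kuwala.
# With DRY_RUN=1 (the default here) the commands are only printed;
# set DRY_RUN=0 to actually execute them.
run_trends() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "docker-compose run google-trends --continent=europe --country=austria --keyword=\"$1\""
  else
    docker-compose run google-trends --continent=europe --country=austria --keyword="$1"
  fi
}

for kw in "Altlerchenfeld" "Gänserndorf" "Breitenfeld" "Josefstadt" \
          "Innere Stadt" "St. Pölten" "Aspern" "Essling" "HADERSDORF" \
          "Hütteldorf" "Landstraße" "Leopoldstadt"; do
  run_trends "$kw"
done
```

Running it with the default `DRY_RUN=1` prints one command per keyword, so the list can be checked before anything is started.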

Problems:

Downloading the PBF and generating the Parquet files works (partially):

1) There seem to be some memory issues in Docker.

2) The process step fails completely for me.

3) Any subsequent steps, e.g. google-poi or the ingestion of other OSM steps, are blocked because of (2).