Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We have set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition, we feed third-party data into data science models and products, with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) high-resolution demographics data, b) points of interest from OpenStreetMap, c) Google Popular Times.
hello hello,
querying population density data (only tested for Germany) works fine when choosing 'total' as the category.
However, when choosing a different category (e.g. 'women'), the downloaded files show up as .csv in the tmp folder, but the code breaks when creating the parquet files. Error message:
Exception occurred during processing of request from ('127.0.0.1', 33952)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/socketserver.py", line 316, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/local/lib/python3.10/socketserver.py", line 347, in process_request
    self.finish_request(request, client_address)
  File "/usr/local/lib/python3.10/socketserver.py", line 360, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/local/lib/python3.10/socketserver.py", line 747, in __init__
    self.handle()
  File "/usr/local/lib/python3.10/site-packages/pyspark/accumulators.py", line 262, in handle
    poll(accum_updates)
  File "/usr/local/lib/python3.10/site-packages/pyspark/accumulators.py", line 235, in poll
    if func():
  File "/usr/local/lib/python3.10/site-packages/pyspark/accumulators.py", line 239, in accum_updates
    num_updates = read_int(self.rfile)
  File "/usr/local/lib/python3.10/site-packages/pyspark/serializers.py", line 564, in read_int
    raise EOFError
EOFError
----------------------------------------
ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/py4j/clientserver.py", line 480, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/py4j/java_gateway.py", line 1038, in send_command
    response = connection.send_command(command)
  File "/usr/local/lib/python3.10/site-packages/py4j/clientserver.py", line 503, in send_command
    raise Py4JNetworkError(
py4j.protocol.Py4JNetworkError: Error while sending or receiving
Traceback (most recent call last):
  File "/opt/app/pipelines/population-density/src/main.py", line 21, in <module>
    Processor.start(files, output_dir, updated_date)
  File "/opt/app/pipelines/population-density/src/Processor.py", line 70, in start
    df.write.mode("overwrite").parquet(f"{output_dir}{updated_date}_result.parquet")
  File "/usr/local/lib/python3.10/site-packages/pyspark/sql/readwriter.py", line 885, in parquet
    self._jwrite.parquet(path)
  File "/usr/local/lib/python3.10/site-packages/py4j/java_gateway.py", line 1321, in __call__
    return_value = get_return_value(
  File "/usr/local/lib/python3.10/site-packages/pyspark/sql/utils.py", line 111, in deco
    return f(*a, **kw)
  File "/usr/local/lib/python3.10/site-packages/py4j/protocol.py", line 334, in get_return_value
    raise Py4JError(
py4j.protocol.Py4JError: An error occurred while calling o84.parquet
ERROR: 1
Any idea how I can fix this? (I work on macOS Monterey with an Intel chip and only need the parquet files.)
Thank you so much for any help, and in general for this really awesome project!
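In case it helps with narrowing this down: "Answer from Java side is empty" means the Python side lost its connection to the JVM, so the actual cause is probably on the Java side and not visible in the trace above. One thing that can be checked without Spark is whether the CSVs downloaded for the failing category differ noticeably in size or row count from the 'total' ones. Below is a minimal sketch using only the standard library. The function name `summarize_csv_files` is hypothetical and not part of Kuwala, and the code assumes each CSV starts with a header row, which I have not verified for these files.

```python
import csv
from pathlib import Path


def summarize_csv_files(directory):
    """Return (file name, size in bytes, data row count) for each CSV in a directory.

    Assumption: every CSV has exactly one header row, which is not counted.
    """
    summary = []
    for path in sorted(Path(directory).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            next(reader, None)  # skip the assumed header row (no error if file is empty)
            row_count = sum(1 for _ in reader)
        summary.append((path.name, path.stat().st_size, row_count))
    return summary
```

Calling `summarize_csv_files(...)` on the tmp folder once after a 'total' run and once after a 'women' run, then comparing the two lists, would show whether the failing input is much larger or empty. The exact tmp path depends on the local setup, so it is left out here.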