Open PeterFogh opened 1 year ago
My script loads some 1 GB CSV files using dask and writes the data to Parquet. However, the dask job sometimes fails with aiohttp.client_exceptions.ServerTimeoutError: Timeout on reading data from socket, caused by fsspec.exceptions.FSTimeoutError.
import dask.dataframe as ddf

STORAGE_OPTIONS = {'account_name': '8200datalakestdev', 'anon': False}
# Read the semicolon-separated CSV files, then write the same data as Parquet.
ddf_data = ddf.read_csv(input_path, storage_options=STORAGE_OPTIONS, sep=';')
ddf_data.to_parquet(output_path, storage_options=STORAGE_OPTIONS)
Is there a way to configure the timeout?
I have tried STORAGE_OPTIONS={'account_name': '8200datalakestdev', 'anon': False, 'timeout':1}, but it made no difference.
Perhaps my question relates to the PR: https://github.com/fsspec/adlfs/pull/364
Maybe https://github.com/fsspec/adlfs/pull/430 would help?
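Since the failure is intermittent, one possible stopgap until the timeout itself can be configured is to retry the step that times out. The following is only a minimal sketch using the standard library: `retry_on_timeout` is a hypothetical helper, not part of dask, fsspec, or adlfs, and the attempt count and delay are arbitrary assumptions.

```python
import asyncio
import time


def retry_on_timeout(func, attempts=3, delay=1.0,
                     exceptions=(TimeoutError, asyncio.TimeoutError)):
    """Call func() and retry it when it raises one of `exceptions`.

    Hypothetical helper: waits delay, 2*delay, 4*delay, ... seconds between
    tries and re-raises the last exception once `attempts` calls have failed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # out of retries: surface the original error
            time.sleep(delay * 2 ** (attempt - 1))
```

In the script above this might be used as `retry_on_timeout(lambda: ddf_data.to_parquet(output_path, storage_options=STORAGE_OPTIONS))`. If the exception that reaches your code is not caught by the default tuple, pass the relevant class explicitly (for example `exceptions=(fsspec.exceptions.FSTimeoutError,)`). Note that each retry repeats the whole read and write, so this hides the symptom rather than fixing the underlying timeout.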