Thanks for your report.
Can you add the version that still worked?
Previously working version was: 2023.9.1
Dask converts your strings to Arrow-backed strings under the hood to improve performance and memory usage. Unfortunately, Arrow doesn't support lookahead expressions in regexes, see https://github.com/apache/arrow/issues/40220
You can disable this through
dask.config.set({"dataframe.convert-string": False})
but this will slow things down and increase memory consumption by quite a bit.
Closing here since there is nothing we can do on the Dask side.
In Dask 2024.2.1, we suddenly have an issue with a regex that contains a negative lookahead: the pattern is now treated as invalid.
This results in the following error:
Environment: