Closed michael-simons closed 9 months ago
Hmm, that seems to be flaky… Sometimes it works on a fresh instance, sometimes it does not.
Ok, this is the error message when the file already exists… It is stated in the docs that the file must not exists, but you would never assume that to be the cause reading the message.
Changed the title of this issue.
Hi! Thanks for filing this issue. Seems like this is an underlying issue with the GDAL driver, or perhaps with how it interacts with our filesystem abstraction. It may very well be possible that we can improve this error message somehow.
So I have spent quite some time looking into this. The problem is that when DuckDB detects that an output file already exists it by default writer the output to a separate file suffixed with .tmp
before doing a swap at the end. This is to avoid wrecking the output file if the write would fail halfway.
The problem is that some GDAL drivers (like xlsx) checks that the file they want to write ends with a their extension before they start writing anything as an extra failsafe. But this doesn't work when writing the tempfile as that changes the extension from e.g. .xlsx
to .xlsx.tmp
, hence the error.
I think the best solution here is to change it in DuckDB core so that we write temp-files with the original extension, but this can get a little bit hairy as we need to do quite a bit of path manipulation and detection. But I'll look into it and reference this issue once I got a PR up.
Thanks for the detailed and understandable write-up. I can relate and agree.