ToucanToco / fastexcel

A Python wrapper around calamine
http://fastexcel.toucantoco.dev/
MIT License
102 stars 6 forks source link

Add support for `Windows` #154

Closed alexander-beedie closed 8 months ago

alexander-beedie commented 8 months ago

@lukapeschke: Seems fastexcel is currently only available for Linux/MacOS. Would be great to get a Windows build out too 👍

ritchie46 commented 8 months ago

The CI builds in polars can help set that up. https://github.com/pola-rs/polars/blob/main/.github/workflows/release-python.yml

lukapeschke commented 8 months ago

I'll look into it, thanks for the hints :slightly_smiling_face:

lukapeschke commented 8 months ago

@alexander-beedie I just released v0.8.0, and the windows wheels are available on PyPI. Could you please check if everything works as expected for you ? I unfortunately don't have access to a windows machine right now :grimacing:

alexander-beedie commented 8 months ago

@alexander-beedie I just released v0.8.0, and the windows wheels are available on PyPI. Could you please check if everything works as expected for you ? I unfortunately don't have access to a windows machine right now 😬

Me neither - but our CI does, so let's see if it goes green ;) ➡️ https://github.com/pola-rs/polars/pull/14171


@lukapeschke: update - it did not 😅

thread '<unnamed>' panicked at src\lib.rs:11:41:
called `Result::unwrap()` on an `Err` value: 
    Could not open workbook at C:\Users\RUNNER~1\AppData\Local\Temp\tmpz3e5atf0

Caused by:
    Cannot detect file format

My initial guess is that the file-reading code doesn't understand the 8.3 file path (eg: the "~" may need to be expanded?) Unfortunately I also don't have access to a Windows box to confirm if that's it or not...

lukapeschke commented 8 months ago

@alexander-beedie Argh! I think the message comes from calamine, when it tries to infer the file type without knowing its extension: https://github.com/tafia/calamine/blob/v0.22.1/src/auto.rs#L40-L52

We unfortunately don't have much context, but my guess for the issue would be:

So I believe that a fix for that would be to close the file before its name is passed to read_excel. According to https://github.com/python/cpython/issues/58451 and the 3.12 docs This can be done by passing delete_on_close=False (which would cause the file to be closed when the object is deleted).

However, for earlier versions, I fear that you won't have a choice but to pass delete=False, and manually close and delete the file in a try/finally block :confused:

Or you could wait for #162 to be implemented and directly pass source.getvalue() to read_excel once it's done

alexander-beedie commented 8 months ago

@alexander-beedie Argh! I think the message comes from calamine, when it tries to infer the file type without knowing its extension: https://github.com/tafia/calamine/blob/v0.22.1/src/auto.rs#L40-L52

Perhaps, but the same code works on Linux/Mac - it isn't a Windows specific code path 🤔

lukapeschke commented 8 months ago

@alexander-beedie Yes, but I believe that the path fails on windows because the NamedTemporaryFile cannot be opened a second time. Thus, every open_workbook<X> branch fails, not because the format is invalid, but because the file cannot be opened.

Closing the tempfile before passing it to read_excel and cleaning it up manually afterwards seems to work as expected :slightly_smiling_face: https://github.com/pola-rs/polars/pull/14183

alexander-beedie commented 8 months ago

@lukapeschke: I think this issue can be closed, as 0.8.0 looks good! (Polars unit tests now run on all platforms successfully: https://github.com/pola-rs/polars/pull/14171).

lukapeschke commented 8 months ago

Great news :tada: