Closed davetapley closed 6 months ago
If you need --hidden-import pyarrow, you likely have another package that uses pyarrow and needs to be hooked. What package is that?
@rokm off the top of my head, maybe duckdb, but it is optional 🤔
Is there a better way to check? 🙏🏻
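One way to narrow this down is to ask the installed package metadata which distributions declare pyarrow as a requirement. The following is only a sketch: the helper names `requirement_name` and `dists_requiring` are made up for illustration, and it only sees declared dependencies (including optional extras). A package that imports pyarrow at runtime without declaring it will not show up.

```python
import re
from importlib.metadata import distributions


def requirement_name(req: str) -> str:
    """Extract the bare project name from a requirement string.

    Example input: "pyarrow>=7.0; extra == 'parquet'"
    """
    name = re.split(r"[\s;<>=!~\[(]", req.strip(), maxsplit=1)[0]
    # Normalize so "Py_Arrow" and "py-arrow" compare equal.
    return re.sub(r"[-_.]+", "-", name).lower()


def dists_requiring(target: str) -> list[str]:
    """Return installed distributions that declare `target` as a requirement."""
    wanted = requirement_name(target)
    found = set()
    for dist in distributions():
        for req in dist.requires or []:
            if requirement_name(req) == wanted:
                dist_name = dist.metadata["Name"]
                if dist_name:  # skip broken installs without a name
                    found.add(dist_name)
    return sorted(found)


if __name__ == "__main__":
    print(dists_requiring("pyarrow"))
```

Run this in the same environment that PyInstaller builds from, otherwise the list reflects a different set of packages.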
What does the error traceback look like if you don't add those hidden imports?
File "ng\core\cache\writer.py", line 16, in write_asset
asset.to_parquet(path, index=False)
File "pandas\core\frame.py", line 2973, in to_parquet
File "pandas\io\parquet.py", line 483, in to_parquet
File "pandas\io\parquet.py", line 189, in write
File "pyarrow\\table.pxi", line 3869, in pyarrow.lib.Table.from_pandas
File "pyarrow\pandas_compat.py", line 572, in dataframe_to_arrays
File "pyarrow\pandas_compat.py", line 375, in _get_columns_to_convert
File "pyarrow\\pandas-shim.pxi", line 199, in pyarrow.lib._PandasAPIShim.is_sparse
File "pyarrow\\pandas-shim.pxi", line 200, in pyarrow.lib._PandasAPIShim.is_sparse
File "pyarrow\\pandas-shim.pxi", line 116, in pyarrow.lib._PandasAPIShim._have_pandas_internal
File "pyarrow\\pandas-shim.pxi", line 104, in pyarrow.lib._PandasAPIShim._check_import
File "pyarrow\\pandas-shim.pxi", line 57, in pyarrow.lib._PandasAPIShim._import_pandas
ModuleNotFoundError: No module named 'pyarrow.vendored.version'
[19020] Failed to execute script 'cli_main' due to unhandled exception!
That to_parquet is from pandas, with logic to use pyarrow, AFAICT here.
Which I guess is why this doesn't find it?
FYI it will become required in the upcoming pandas 3.0, if that makes a difference:
Since the pandas hook is in the pyinstaller repo, should I open an issue there instead? 🤔
Hmm, based on the traceback, the following example should reproduce the problem when frozen:
import pandas as pd
df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
df.to_parquet('test.par')
But it seems to work for me - can you test with your environment?
What version of python, PyInstaller, and pyinstaller-hooks-contrib are you using?
And also, what version of pyarrow?
I could repro with that example, using:
pyinstaller==5.13.1
pyinstaller-hooks-contrib==2023.4
pandas==2.1.0
pyarrow==14.0.1
On Windows 11, if that matters.
Maybe it's time to update at least pyinstaller-hooks-contrib? Your version does not have #662, so it's not surprising that pyarrow.vendored.version is not collected...
Well that's embarrassing. Sorry for wasting your time 😞
Which library is the hook for?
pyarrow
Have you gotten the library to work with pyinstaller?
Yes, but it needs several hidden imports.
Additional context
I see it's supposed to be supported per ⬇️ so I'm not sure if this is a regression and should be fixed there, or do all hooks go here now?
I need to --hidden-import pyarrow and pyarrow.vendored.version to get it working.
See also ⬇️ which mentions pyarrow.vendored.version.
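For reference, the workaround described above can also be driven from Python instead of the command line. This is a minimal sketch, not a recommended fix: `build_args` and `freeze` are hypothetical helpers, and the script name `cli_main.py` is an assumption. It relies on `PyInstaller.__main__.run`, which accepts the same arguments as the `pyinstaller` command.

```python
def build_args(script: str, hidden_imports: list[str]) -> list[str]:
    """Build a pyinstaller argument list with one --hidden-import per module."""
    args = [script]
    for name in hidden_imports:
        args += ["--hidden-import", name]
    return args


def freeze(script: str, hidden_imports: list[str]) -> None:
    """Run PyInstaller programmatically (requires PyInstaller to be installed)."""
    import PyInstaller.__main__

    PyInstaller.__main__.run(build_args(script, hidden_imports))


# Example call (script name is an assumption, not taken from the project):
# freeze("cli_main.py", ["pyarrow", "pyarrow.vendored.version"])
```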