Open sujaybanerjee opened 3 months ago
Hi @sujaybanerjee, this is a polars version issue. The main branch of ESGPT is only guaranteed to work with polars up to 0.18.15, as specified in the pyproject.toml file. Can you try this on the dev branch, which supports a much more recent version of polars?
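To confirm whether an environment falls outside that range before switching branches, a quick check along these lines may help. This is only a sketch: the helper names (`parse_version`, `is_supported`, `installed_polars_version`) are made up for illustration, the 0.18.15 bound is the one quoted above, and pyproject.toml remains the authoritative source for the constraint.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile

# Upper bound quoted in this thread for the main branch (assumption: verify
# against pyproject.toml before relying on it).
MAX_SUPPORTED = "0.18.15"


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '0.20.26' into (0, 20, 26), keeping only the leading digits of each part."""
    parts = []
    for piece in text.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def is_supported(installed: str, max_supported: str = MAX_SUPPORTED) -> bool:
    """Compare numerically, so that 0.18.9 counts as older than 0.18.15."""
    return parse_version(installed) <= parse_version(max_supported)


def installed_polars_version() -> str | None:
    """Return the installed polars version, or None if polars is not installed."""
    try:
        return version("polars")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_polars_version()
    if found is None:
        print("polars is not installed in this environment")
    elif is_supported(found):
        print(f"polars {found} is within the range quoted for the main branch")
    else:
        print(f"polars {found} is newer than {MAX_SUPPORTED}; try the dev branch")
```

The versions are compared as integer tuples rather than strings because a plain string comparison would rank "0.18.9" above "0.18.15".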
In this section, when I run this code block, I get an error.
import subprocess
command = """\
PYTHONPATH=$(pwd):$PYTHONPATH ./scripts/build_dataset.py \
    --config-path="$(pwd)/sample_data/" \
    --config-name=dataset \
    "hydra.searchpath=[$(pwd)/configs]"
"""
command_out = subprocess.run(command, shell=True, capture_output=True)
print(command_out.stdout.decode())
if command_out.returncode == 1: print("Command Errored!")
print(command_out.stderr.decode())
Here is the error message I get:
$ PYTHONPATH=$(pwd):$PYTHONPATH python3 ./scripts/build_dataset.py --config-path="$(pwd)/sample_data/" --config-name=dataset "hydra.searchpath=[$(pwd)/configs]"
Error executing job with overrides: []
Traceback (most recent call last):
  File "/home/user/EventStreamGPT/./scripts/build_dataset.py", line 364, in main
    ESD = Dataset(config=config, input_schema=dataset_schema)
  File "/home/user/EventStreamGPT/EventStream/data/dataset_base.py", line 550, in __init__
    events_df, dynamic_measurements_df = self.build_event_and_measurement_dfs(
  File "/home/user/EventStreamGPT/EventStream/data/dataset_base.py", line 259, in build_event_and_measurement_dfs
    cls._process_events_and_measurements_df(
  File "/home/user/EventStreamGPT/EventStream/data/dataset_polars.py", line 356, in _process_events_and_measurements_df
    if len(df.columns) > 4:
  File "/home/user/.local/lib/python3.10/site-packages/polars/lazyframe/frame.py", line 411, in columns
    return self._ldf.columns()
polars.exceptions.ComputeError: failed to determine supertype of cat and i64

This error occurred with the following context stack:
    [1] 'select' failed
    [2] 'with_columns' input failed to resolve
    [3] 'drop' input failed to resolve
    [4] 'with_columns' input failed to resolve
    [5] 'drop' input failed to resolve
    [6] 'filter' input failed to resolve
    [7] 'filter' input failed to resolve
    [8] 'with_columns' input failed to resolve
    [9] 'drop' input failed to resolve
    [10] 'filter' input failed to resolve
    [11] 'select' input failed to resolve
    [12] 'unique' input failed to resolve
    [13] 'with row index' input failed to resolve

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
I am using Python 3.10.12 and polars 0.20.26. I was wondering how to fix this.
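The last line of the error output suggests setting HYDRA_FULL_ERROR=1 to get a complete stack trace. One way to do that from the same Python cell is sketched below; the helper name `run_with_full_hydra_error` is hypothetical, and the sketch only adds the variable to the environment of the child process without changing anything else about how the command is run.

```python
import os
import subprocess


def run_with_full_hydra_error(command: str) -> subprocess.CompletedProcess:
    """Run a shell command with HYDRA_FULL_ERROR=1 added to its environment."""
    # Copy the current environment so PATH, PYTHONPATH, etc. are preserved,
    # then add the variable named in the error message.
    env = {**os.environ, "HYDRA_FULL_ERROR": "1"}
    return subprocess.run(
        command, shell=True, capture_output=True, text=True, env=env
    )
```

Passing the same `command` string as above to this helper and printing the `stderr` attribute of the result should then show the full trace rather than the shortened one.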