mmcdermott / EventStreamGPT

Dataset and modelling infrastructure for modelling "event streams": sequences of continuous time, multivariate events with complex internal dependencies.
https://eventstreamml.readthedocs.io/en/latest/
MIT License

'ExprArrayNameSpace' object has no attribute 'eval' #30

Closed Guitaricet closed 1 year ago

Guitaricet commented 1 year ago

The error occurs when following the MIMIC-IV tutorial, at the "Data Extraction & Pre-processing" step.


```
PYTHONPATH="$EFGPT_PATH:$PYTHONPATH" python \
        $EFGPT_PATH/scripts/build_dataset.py \
        --config-path=$(pwd)/configs \
        --config-name=mimiciv \
        "hydra.searchpath=[$EFGPT_PATH/configs]"

Omitting {'min_los': 3, 'min_admissions': 1} from config!
Empty new events dataframe of type VISIT!
Empty new events dataframe of type ICU_STAY!
Empty new events dataframe of type PROCEDURE!
Error executing job with overrides: []
Traceback (most recent call last):
  File "/mnt/shared_home/vlialin/documents/EventStreamGPT/scripts/build_dataset.py", line 349, in main
    ESD = Dataset(config=config, input_schema=dataset_schema)
  File "/mnt/shared_home/vlialin/documents/EventStreamGPT/EventStream/data/dataset_base.py", line 552, in __init__
    self._validate_and_set_initial_properties(subjects_df, events_df, dynamic_measurements_df)
  File "/mnt/shared_home/vlialin/documents/EventStreamGPT/EventStream/data/dataset_base.py", line 580, in _validate_and_set_initial_properties
    self._agg_by_time()
  File "/mnt/shared_home/vlialin/miniconda3/envs/clinicallm/lib/python3.10/site-packages/mixins/timeable.py", line 101, in wrapper_timing
    out = fn(self, *args, **kwargs)
  File "/mnt/shared_home/vlialin/documents/EventStreamGPT/EventStream/data/dataset_polars.py", line 639, in _agg_by_time
    .arr.eval(pl.col("").cast(pl.Utf8))
AttributeError: 'ExprArrayNameSpace' object has no attribute 'eval'

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
```

mmcdermott commented 1 year ago

Thanks @Guitaricet. Just to close the loop here as well: this is caused by the polars version. They changed the namespace for nested arrays in 0.18, so `.arr.eval` is no longer available there. I've pushed a fix to dev in https://github.com/mmcdermott/EventStreamGPT/commit/7f002b5642526812efc51361eb5c83f97f97451d.
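
For anyone who needs the same code to run on both sides of the 0.18 boundary, here is a minimal sketch of a version check. It assumes the list operations moved from `.arr` to `.list` in polars 0.18; the helper names `uses_list_namespace` and `list_namespace` are hypothetical and are not part of EventStreamGPT or polars. It is not necessarily what the linked commit does.

```python
def uses_list_namespace(polars_version: str) -> bool:
    """Return True if this polars version is 0.18 or newer.

    Assumption: from 0.18 on, list operations such as `eval` live under
    `.list` rather than `.arr`.
    """
    major, minor = (int(part) for part in polars_version.split(".")[:2])
    return (major, minor) >= (0, 18)


def list_namespace(expr, polars_version: str):
    """Pick the namespace that carries list operations for this version."""
    return expr.list if uses_list_namespace(polars_version) else expr.arr


# Intended use with polars installed (not executed here):
#   import polars as pl
#   ns = list_namespace(pl.col("my_list_col"), pl.__version__)
#   expr = ns.eval(pl.col("").cast(pl.Utf8))
```

Pinning a single polars version in the environment avoids the need for a shim like this altogether.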

juancq commented 1 year ago

@mmcdermott this also needs to be changed in the unit tests.