Eventual-Inc / Daft

Distributed DataFrame for Python designed for the cloud, powered by Rust
https://getdaft.io
Apache License 2.0

[Support] Daft Hudi | Segmentation fault when running Python script on macOS M2 #2217

Closed soumilshah1995 closed 2 weeks ago

soumilshah1995 commented 2 weeks ago

Python 3.9.6, Hudi version 0.14.0

Python code:

import daft

db_name = "hudidb"
table_name = "customers"

path = f"file:///Users/soumilshah/IdeaProjects/SparkProject/tem/{db_name}/{table_name}"

df = daft.read_hudi(path)
df.show()

Output:

(venv) (base) soumilshah@Soumils-MBP SparkProject % python3 sam.py 
WARNING:root:HudiScanOperator(customers) has partitioning keys = [PartitionField(year#Utf8)], but no partition filter was specified. This will result in a full table scan.
╭─────────────────────┬────────────────────┬─────────────┬──────┬────────────────────╮
│ _hoodie_commit_time ┆ _hoodie_commit_seq ┆      …      ┆ year ┆ _hoodie_is_deleted │
│ ---                 ┆ no                 ┆             ┆ ---  ┆ ---                │
│ Utf8                ┆ ---                ┆ (11 hidden) ┆ Utf8 ┆ Boolean            │
│                     ┆ Utf8               ┆             ┆      ┆                    │
╞═════════════════════╪════════════════════╪═════════════╪══════╪════════════════════╡
│ 20240502090527434   ┆ 20240502090527434_ ┆ …           ┆ 2022 ┆ false              │
│                     ┆ 1_0                ┆             ┆      ┆                    │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 20240502090527434   ┆ 20240502090527434_ ┆ …           ┆ 2023 ┆ false              │
│                     ┆ 0_0                ┆             ┆      ┆                    │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 20240502090527434   ┆ 20240502090527434_ ┆ …           ┆ 2020 ┆ false              │
│                     ┆ 2_0                ┆             ┆      ┆                    │
╰─────────────────────┴────────────────────┴─────────────┴──────┴────────────────────╯

(Showing first 3 of 3 rows)
zsh: segmentation fault 

Additional Context: I suspect this issue might be related to memory management or library compatibility on macOS, but I'm not certain.
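
One possible way to narrow this down (a sketch, not a confirmed fix): Python's standard `faulthandler` module can dump the Python-level traceback when the process receives SIGSEGV. Since the table prints before the crash, that traceback might show whether the fault happens inside `df.show()` or later, during interpreter shutdown. The helper names `describe_environment`, `enable_crash_traceback`, and `run_repro` are hypothetical and exist only for this illustration; the Daft calls are the same two used in the script above.

```python
import faulthandler
import platform
import sys


def describe_environment() -> dict:
    """Collect interpreter and platform details worth attaching to a crash report."""
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "system": platform.system(),    # "Darwin" on macOS
        "machine": platform.machine(),  # "arm64" for a native Apple Silicon build
    }


def enable_crash_traceback() -> bool:
    """Try to install handlers that print Python tracebacks on fatal signals.

    Returns True if faulthandler is active afterwards, False if it could not
    be enabled (for example when stderr has no usable file descriptor).
    """
    try:
        faulthandler.enable(file=sys.stderr, all_threads=True)
    except (AttributeError, OSError, RuntimeError, ValueError):
        return False
    return faulthandler.is_enabled()


def run_repro(path: str) -> None:
    """Re-run the two Daft calls from the report with crash reporting enabled.

    `path` is a placeholder for the Hudi table URI used in the original script.
    """
    enable_crash_traceback()
    print(describe_environment(), file=sys.stderr)

    # Import daft only after the handlers are installed, so a crash during
    # import or teardown is also captured.
    import daft

    df = daft.read_hudi(path)
    df.show()
```

Calling `run_repro(path)` with the same `path` as in the script should print the environment details and, if the crash reproduces, a "Fatal Python error: Segmentation fault" report with the active Python stack. The same handler can be enabled without code changes by running `python3 -X faulthandler sam.py`. A `machine` value of `x86_64` on an M2 would suggest the interpreter is running under Rosetta, which is one form of the library-compatibility problem suspected above.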

samster25 commented 1 week ago

@soumilshah1995 Did you find the issue?