ibis-project / ibis

the portable Python dataframe library
https://ibis-project.org
Apache License 2.0

feat(datafusion): add map methods to datafusion compiler #10510

Open venkat-oss opened 3 days ago

venkat-oss commented 3 days ago

- ops.Map - ✅ (already supported by ibis)
- ops.MapLength - ✅
- ops.MapGet - ✅
- ops.MapContains - ✅
- ops.MapKeys - ✅
- ops.MapValues - ✅
- ops.MapMerge - 🆘 Need help

venkat-oss commented 3 days ago

@gforsyth and @cpcloud, could you please provide some feedback or pointers on how MapMerge could be implemented?

I'm not sure how MapMerge can be implemented unless DataFusion supports it natively.
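For reference, the behaviour MapMerge needs to reproduce can be sketched in plain Python, which may help when checking whether DataFusion's existing map and array functions can be combined to emulate it. This is only an illustration of the semantics, not DataFusion or ibis compiler code: the `map_merge` helper is hypothetical, maps are modelled as a pair of parallel key and value lists, and the rule that the right-hand map wins on duplicate keys is an assumption that should be checked against the other backends.

```python
def map_merge(left, right):
    """Merge two maps given as (keys, values) pairs of parallel lists.

    Hypothetical helper: models MapMerge as "concatenate the key and value
    arrays, then rebuild the map". Assumes the right-hand map wins when a
    key appears in both inputs.
    """
    left_keys, left_values = left
    right_keys, right_values = right

    # Step 1: concatenate the key arrays and the value arrays.
    keys = left_keys + right_keys
    values = left_values + right_values

    # Step 2: rebuild the map; later entries overwrite earlier ones,
    # while each key keeps the position of its first occurrence.
    merged = {}
    for key, value in zip(keys, values):
        merged[key] = value

    return list(merged.keys()), list(merged.values())


result = map_merge((["a", "b"], [1, 2]), (["b", "c"], [3, 4]))
print(result)
```

If a backend can extract keys and values as arrays, concatenate arrays, and build a map from two arrays while deduplicating keys, the same two steps could in principle be expressed in SQL. Whether DataFusion offers all of those pieces is exactly the open question here.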

IndexSeek commented 2 days ago

Very nice! Thank you for working on this, @venkat-oss.

Could you remove the "notyet" markers in the tests so that we can ensure these operations run properly with DataFusion? Here is an example of a marker and where they can be found in the case of the tests for map:

https://github.com/ibis-project/ibis/blob/db8af10a30fb204dd1dff25134e88a5d7433f4e0/ibis/backends/tests/test_map.py#L657-L660

After this, can you run your code through the linter? If you have just installed, you can run: just fmt. If you don't have just available, you can run the underlying commands directly:

    ruff format .
    ruff check --fix .