SuperDuperDB / superduperdb

🔮 SuperDuperDB: Bring AI to your database! Build, deploy and manage any AI application directly with your existing data infrastructure, without moving your data. Including streaming inference, scalable model training and vector search.
https://superduperdb.com
Apache License 2.0
4.54k stars 444 forks

Draft: Implement data observation properties (#2124) #2182

Open klepp0 opened 2 weeks ago

klepp0 commented 2 weeks ago

Description

This pull request is a first draft, opened to get feedback. I'm trying to address #2124, but I'm quite new to the codebase and not confident that I used every module as intended. Some code might also be misplaced, so please let me know what you would like changed.

I implemented a DataView class that inherits from DataFrame to provide the functionality for both examples proposed in #2124. However, the filtering does not feel convenient to me yet. I'd either replace DataView with a normal DataFrame or pass the filter arguments directly into the select() method before creating the DataFrame. What do you think?
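To make the two options easier to compare, here is a minimal sketch. It is not the code in this PR: it assumes DataFrame refers to pandas.DataFrame, and the `where_equal` method, the module-level `select` function, and the sample records are all hypothetical names invented for illustration.

```python
import pandas as pd


class DataView(pd.DataFrame):
    """Hypothetical stand-in for the PR's DataView (assumes a pandas DataFrame base)."""

    @property
    def _constructor(self):
        # pandas uses this hook so that slicing and masking return DataView again.
        return DataView

    def where_equal(self, **conditions):
        """Keep rows where every named column equals the given value (hypothetical helper)."""
        mask = pd.Series(True, index=self.index)
        for column, value in conditions.items():
            mask &= self[column] == value
        return self[mask]


def select(records, **conditions):
    """Hypothetical alternative: filter the records first, then build a plain DataFrame."""
    rows = [r for r in records if all(r.get(k) == v for k, v in conditions.items())]
    return pd.DataFrame(rows)


# Made-up sample data, only to exercise both variants.
records = [
    {"_id": 1, "label": "cat", "score": 0.9},
    {"_id": 2, "label": "dog", "score": 0.4},
    {"_id": 3, "label": "cat", "score": 0.7},
]

# Option 1: filter on the DataFrame subclass after it has been created.
view = DataView(records).where_equal(label="cat")

# Option 2: pass the filter arguments in before the DataFrame is created.
frame = select(records, label="cat")
```

Both variants keep the same rows here. The subclass keeps the filter available on the returned object, while the second variant avoids a custom class and hands back an ordinary DataFrame.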

I added a few tests modeled on the existing tests for Listener objects. However, I haven't fully understood how each Component is used, so the tests could be more robust, and I'm not aware of the edge cases that could arise with other query types.

I'm happy to receive your feedback.

Related Issues

feat: add DataView on top of DataFrame
feat: add data properties to simplify data observation
test: data property to simplify data observation

This addresses #2124.

Checklist

Additional Notes or Comments