activeloopai / deeplake

Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
Mozilla Public License 2.0
7.88k stars, 607 forks

Check function passed to dataset.filter() is correctly configured #2812

Closed by nvoxland-al 3 months ago

nvoxland-al commented 3 months ago

🚀 🚀 Pull Request

Impact

Description

When defining a filter like this:

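import deeplake

# ds is assumed to be an existing dataset with a tensor named t1
@deeplake.compute
def filter_fn(sample):
    return sample.t1.data()["value"] == "Hello"

# Note the call: filter_fn(), not filter_fn
view = ds.filter(filter_fn())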

If you forget the () after filter_fn and write view = ds.filter(filter_fn) instead, your data is not filtered.

This PR looks for that pattern and auto-calls the function for you.
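
The check described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in deeplake/core/query/filter.py: the compute decorator, the ComputeFunction class, the is_compute_factory marker, and the resolve_filter_function and filter_samples helpers are all hypothetical stand-ins invented for this example. In this sketch, passing the uncalled decorated function would yield a truthy object for every sample, which is one plausible reason nothing gets filtered.

```python
class ComputeFunction:
    """Hypothetical stand-in for what a compute-style decorator yields once called."""

    def __init__(self, func, args, kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __call__(self, sample):
        # Apply the wrapped function to one sample, plus any bound arguments.
        return self.func(sample, *self.args, **self.kwargs)


def compute(func):
    """Hypothetical decorator: the decorated name becomes a factory, not a predicate."""

    def factory(*args, **kwargs):
        return ComputeFunction(func, args, kwargs)

    factory.is_compute_factory = True  # marker inspected by the check below
    return factory


def resolve_filter_function(fn):
    """Auto-call an uncalled factory; return anything else unchanged."""
    if getattr(fn, "is_compute_factory", False):
        return fn()
    return fn


def filter_samples(samples, fn):
    """Toy filter: keep the samples for which the resolved predicate is true."""
    predicate = resolve_filter_function(fn)
    return [s for s in samples if predicate(s)]


@compute
def is_hello(sample):
    return sample["value"] == "Hello"


def plain_is_hello(sample):
    return sample["value"] == "Hello"


samples = [{"value": "Hello"}, {"value": "Bye"}]

# Without the check, the uncalled factory returns a truthy object for any sample.
unchecked = [s for s in samples if is_hello(s)]

# With the check, all three ways of passing the function give the same view.
forgot_call = filter_samples(samples, is_hello)
with_call = filter_samples(samples, is_hello())
undecorated = filter_samples(samples, plain_is_hello)
```

The marker attribute is just one way to recognise the pattern. An isinstance check against whatever type the real decorator returns would serve the same purpose.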

Things to be aware of

It is still valid to call:

def filter_fn(sample):
    return sample.t1.data()["value"] == "Hello"

view = ds.filter(filter_fn)

Here filter_fn is not decorated with @deeplake.compute, so it should NOT be called with () when passed to ds.filter().

Additional Context

sonarcloud[bot] commented 3 months ago

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
50.0% Coverage on New Code
0.0% Duplication on New Code


codecov[bot] commented 3 months ago

Codecov Report

Attention: Patch coverage is 50.00000% with 1 line in your changes missing coverage. Please review.

Files                            Patch %    Lines
deeplake/core/query/filter.py    50.00%     1 Missing :warning:
