machow / siuba

Python library for using dplyr like syntax with pandas and SQL
https://siuba.org
MIT License
1.14k stars 48 forks source link

The name `filter` shadows a builtin #464

Closed nathanjmcdougall closed 1 year ago

nathanjmcdougall commented 1 year ago

Since filter is a builtin, my linter (pylint) warns me about overriding it with the alternative implementation from siuba. I am currently using

from siuba import filter as filtr

As a workaround. However, this is a little bit clumsy and I was wondering whether there would be support for a built-in alias for filter (such as filtr) which would avoid this issue: so I could do something like

from siuba import _, filtr, mutate
machow commented 1 year ago

Hey, thanks for raising that pylint complains about overriding builtins. It's a bit of a funky situation, since pylint also discourages people from using the filter function (see this SO post).

I wonder if it's easier to disable the specific pylint warning about overriding builtins (mentioned in that SO post)? WDYT?

Alternatively, you could import siuba as sb, and use that...

import siuba as sb
from siuba import _, mutate

sb.filter(...)

This would be similar to how numpy exposes a sum() function, which is also a builtin:

import numpy as np
from numpy import mean

np.sum(...)
nathanjmcdougall commented 1 year ago

I don't really want to disable that warning across the board, since it can be helpful to know whether a built-in is overridden in some cases. I don't think pylint allows you to suppress the warning on a function-by-function basis though.

So I'd prefer sb.filter. And I think it's a good point regard numpy with sum, I guess in that case numpy's implementation is very similar the the builtin, so there's less likely to be confusion in practice.

I do worry that sb.filter could be unnecessarily verbose, and it would stick out a bit if you're using simply mutate and select without the sb.. That's why I am still inclined toward an alias, but I understand if you'd be reluctant to support that.

nathanjmcdougall commented 1 year ago

Pylint has a configuration setting which I think will work for my purposes:

[tool.pylint.variables]
redefining-builtins-modules = ["siuba"]

Perhaps this would be worth documenting somewhere?