Open Moondra opened 7 years ago
Can you make a reproducible example? df isn't defined.
Here is a way to reproduce this issue:
download pickled DF
fn = r'D:\download\Small_cap_bio_DF'
df = pd.read_pickle(fn)
df.loc[df['Market Cap'] =='N/A', 'Market Cap'] = '-1'
The following works when the DataFrame is split into four parts:
[x[pd.eval(x['Market Cap'].replace(['[Kk]','[Mm]','[Bb]'],['*10**3','*10**6','*10**9'], regex=True).add(' < 35*10**6'))]
for x in np.split(df,4)]
if we split it into two parts:
[x[pd.eval(x['Market Cap'].replace(['[Kk]','[Mm]','[Bb]'],['*10**3','*10**6','*10**9'], regex=True).add(' < 35*10**6'))]
for x in np.split(df,2)]
it produces:
AttributeError: 'PandasExprVisitor' object has no attribute 'visit_Ellipsis'
Simple repro:
In [93]: s = pd.Series(['1 == 1', '2 == 1'] * 1000)
In [94]: pd.eval(s.head())
Out[94]: array([True, False, True, False, True], dtype=object)
In [95]: pd.eval(s)
AttributeError: 'PandasExprVisitor' object has no attribute 'visit_Ellipsis'
This is sort of a misuse of eval, and I was surprised it worked at all. Apparently the Series repr is being picked up by eval, and with a long Series the truncation characters (...) are parsed as Ellipsis.
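To illustrate the parsing step described above, here is a minimal sketch using only Python's standard `ast` module (it does not call pandas). The expression string is a made-up stand-in for a truncated text form of a long sequence; the point is only that a bare `...` in source text parses as the `Ellipsis` constant, which an expression visitor would then need a handler for.

```python
import ast

# Hypothetical stand-in for a truncated sequence rendered as text.
truncated_text = "[1 == 1, 2 == 1, ...]"

# Parse the text as a single expression, as an expression evaluator would.
tree = ast.parse(truncated_text, mode="eval")

# The last list element is the bare "..." token.
last = tree.body.elts[-1]

# On Python 3.8+ this is an ast.Constant node whose value is Ellipsis.
is_ellipsis = isinstance(last, ast.Constant) and last.value is Ellipsis
print(is_ellipsis)
```

A visitor that only implements handlers for comparisons, names, and numbers would have nothing to dispatch to for this node, which is consistent with the missing `visit_Ellipsis` attribute in the traceback.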
@chris-b1, it's a good reproducible example, thank you!
the limit seems to be 100 rows:
this works:
pd.eval(s.head(100))
The following produces the error mentioned above:
pd.eval(s.head(101))
We don't want to support passing pandas objects to eval, right? It takes a string, not a Series.
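Since eval is meant to take a string, one way to get the filter without passing a Series to it is to convert the suffixed values to numbers directly. The sketch below is plain Python and assumes the 'Market Cap' values are strings such as '20M' or '1.5B', with 'N/A' already replaced by '-1' as in the snippets above. `parse_market_cap` and the sample values are hypothetical, not part of pandas or the original data.

```python
# Hypothetical helper: converts strings like "20M" or "1.5B" to floats.
_MULTIPLIERS = {"K": 10**3, "M": 10**6, "B": 10**9}


def parse_market_cap(text):
    """Return the numeric value of a market-cap string with an optional K/M/B suffix."""
    text = text.strip()
    suffix = text[-1:].upper()
    if suffix in _MULTIPLIERS:
        # Strip the suffix and scale the remaining number.
        return float(text[:-1]) * _MULTIPLIERS[suffix]
    # No suffix: treat the whole string as a plain number (e.g. "-1").
    return float(text)


# Made-up sample values for illustration only.
caps = ["20M", "1.5B", "750k", "-1"]

# Keep the entries below the 35 * 10**6 threshold used in the snippets above.
small = [c for c in caps if parse_market_cap(c) < 35 * 10**6]
print(small)
```

With a DataFrame, the same helper could be applied as `df[df['Market Cap'].map(parse_market_cap) < 35 * 10**6]`. Note that the '-1' placeholder rows pass this filter, just as they would with the original eval-based expression.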
When trying to run this code, I'm getting the above error. I worked with Max U from StackOverflow in a private chat and he concluded it was a bug.
Here is the stackoverflow link: http://stackoverflow.com/questions/43838557/custom-boolean-filtering-in-pandas
Here is my GitHub, which contains the data if you want to reproduce the error:
https://github.com/Moondra/Logistic-Regression-
The data can be found as a pickle under the label 'Small_cap_bio_DF'. Just be sure to use the line
df['Market Cap'][df['Market Cap'] =='N/A'] = '-1'
to remove the N/A values. The line producing the error is
Problem description
Instead of filtering the dataframe in relation to its Marketcap, I'm getting an AttributeError.
Expected Output
Filtering all rows whose Marketcap value is less than 30M.
Output of pd.show_versions()