Closed jungle-boogie closed 9 years ago
Hi.
There is no limit. My guess is that something funky is going on with your date comparison.
Hi robmc,
I'll test this more and if I have the same results, paste some sample data.
Thanks
Hi,
My guess is that something funky is going on with your date comparison.
I don't think it's the date at all...
Look carefully:
cat data.csv | p.df 'df[df.dba=="jungle"]' 'df[df.date=="07/03/2015"]' 'df[df.mid=='2601']' -o table
The digits must be in single quotes.
p.example_data -d tips | p.df 'df[df.sex=="Female"]' 'df[df.smoker=="Yes"]' 'df[df.total_bill=="9.60"]' -o table
Traceback (most recent call last):
  File "/usr/local/bin/p.df", line 9, in <module>
    load_entry_point('pandashells==0.1.4', 'console_scripts', 'p.df')()
  File "/usr/local/lib/python2.7/site-packages/pandashells/bin/p_df.py", line 223, in main
    df = process_command(args, cmd, df)
  File "/usr/local/lib/python2.7/site-packages/pandashells/bin/p_df.py", line 112, in process_command
    df = execute(cmd, scope_entries={'df': df}, retval_name='df')
  File "/usr/local/lib/python2.7/site-packages/pandashells/bin/p_df.py", line 62, in execute
    exec(cmd, scope)
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/pandas/core/ops.py", line 614, in wrapper
    res = na_op(values, other)
  File "/usr/local/lib/python2.7/site-packages/pandas/core/ops.py", line 568, in na_op
    raise TypeError("invalid type comparison")
TypeError: invalid type comparison
p.example_data -d tips | p.df 'df[df.sex=="Female"]' 'df[df.smoker=="Yes"]' 'df[df.total_bill=='9.60']' -o table
total_bill tip sex smoker day time size
9.6 4 Female Yes Sun Dinner 2
So it does not matter if this is a whole number or some decimal number.
And with my real data and using a single quote, I get the desired results.
Thanks
This statement does not work because 9.60 is a floating point number, and by enclosing it in double quotes, you are asking for a string.
p.example_data -d tips | p.df 'df[df.sex=="Female"]' 'df[df.smoker=="Yes"]' 'df[df.total_bill=="9.60"]' -o table
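To illustrate the type mismatch, here is a minimal sketch in plain Python. It does not use pandas, and the total_bill list is made-up data standing in for the column. It only shows that a float never equals its string spelling; the pandas version in the traceback above goes further and raises a TypeError for the same kind of comparison.

```python
# Hypothetical stand-in for the total_bill column: the values are floats.
total_bill = [16.99, 9.60, 10.34]

# Comparing against the string "9.60" never matches a float value.
string_matches = [x for x in total_bill if x == "9.60"]

# Comparing against the number 9.60 matches as intended.
number_matches = [x for x in total_bill if x == 9.60]

print(string_matches)
print(number_matches)
```

The first list comes back empty and the second contains the one matching row value, which is the behaviour you want from the filter.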
This statement works, but it does so by accident.
p.example_data -d tips | p.df 'df[df.sex=="Female"]' 'df[df.smoker=="Yes"]' 'df[df.total_bill=='9.60']' -o table
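The shell closes the single-quoted string just before 9.60 and opens a new one just after it, so you are essentially concatenating the pieces
'df[df.total_bill=='
9.60 (unquoted)
and ']'
which evaluates to a valid expression.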
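The quote handling described here can be checked with Python's standard shlex module, which follows POSIX shell quoting rules. This is only a sketch: shlex approximates what a real shell does and is not what pandashells itself uses to parse arguments.

```python
import shlex

# The command line as typed, with the nested single quotes.
typed = """p.df 'df[df.total_bill=='9.60']' -o table"""

# shlex.split applies POSIX quoting rules, so the result approximates
# the argument list that p.df would receive from the shell.
argv = shlex.split(typed)

for arg in argv:
    print(arg)
```

The second argument comes out as df[df.total_bill==9.60], with no quotes left around the number, which is why the comparison is numeric and the command happens to work.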
The more appropriate way of doing this would be
p.example_data -d tips | p.df 'df[df.sex=="Female"]' 'df[df.smoker=="Yes"]' 'df[df.total_bill==9.60]' -o table
Bash can get a little tricky with the way it uses quotes. I'd recommend looking into that. If this helps, the following two statements are equivalent.
p.example_data -d tips | p.df 'df[df.sex=="Female"]'
p.example_data -d tips | p.df "df[df.sex=='Female']"
However, the second one can get you into trouble because bash performs variable and command substitution inside double quotes, even when you don't want it.
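As a rough check of the equivalence claim, the sketch below splits both command lines with shlex (an approximation of POSIX shell quoting) and compares the resulting Python expressions with the ast module. The variable names are illustrative only.

```python
import ast
import shlex

# The two quoting styles from the commands above.
single_outer = """p.df 'df[df.sex=="Female"]'"""
double_outer = '''p.df "df[df.sex=='Female']"'''

# Extract the expression that would reach p.df in each case.
expr_a = shlex.split(single_outer)[1]
expr_b = shlex.split(double_outer)[1]

# Python does not distinguish between the two string quote styles,
# so both expressions parse to the same syntax tree.
same_expression = ast.dump(ast.parse(expr_a)) == ast.dump(ast.parse(expr_b))

print(expr_a)
print(expr_b)
print(same_expression)
```

Note that shlex does not model variable expansion, so this sketch does not show the interpolation risk that comes with the double-quoted form.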
Hi robdmc,
This statement does not work because 9.60 is a floating point number,
Yes, I had thought it was related to the floating point numbers.
Thank you for your detailed reply and your expert analysis on the proper way to handle floating point numbers.
I don't use bash but I'll keep your advice in mind.
Thank you for writing pandashells. I look forward to any updates you make to it and to the time it will save me.
Hello,
Is there a limit of two (2) for the 'select by row' with the df.h function?
p.df -h shows:
* Select by row
    p.example_data -d tips \
    | p.df 'df[df.sex=="Female"]' 'df[df.smoker=="Yes"]' -o table
I can do two just fine, but with three (3), it says:
Traceback (most recent call last):
  File "/usr/local/bin/p.df", line 9, in <module>
    load_entry_point('pandashells==0.1.4', 'console_scripts', 'p.df')()
  File "/usr/local/lib/python2.7/site-packages/pandashells/bin/p_df.py", line 223, in main
    df = process_command(args, cmd, df)
  File "/usr/local/lib/python2.7/site-packages/pandashells/bin/p_df.py", line 112, in process_command
    df = execute(cmd, scope_entries={'df': df}, retval_name='df')
  File "/usr/local/lib/python2.7/site-packages/pandashells/bin/p_df.py", line 62, in execute
    exec(cmd, scope)
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/pandas/core/ops.py", line 614, in wrapper
    res = na_op(values, other)
  File "/usr/local/lib/python2.7/site-packages/pandas/core/ops.py", line 568, in na_op
    raise TypeError("invalid type comparison")
TypeError: invalid type comparison
My cli input:
cat data.csv | p.df 'df[df.mid=="2600"]' 'df[df.date=="07/02/2015"]' 'df[df.type=="Credit Card Authorize"]' -o table
Is there something obvious that I'm doing wrong here?
Using:
pandashells (0.1.4)