Closed ThoDuyNguyen closed 6 years ago
What is the type of `user_name`? I see that it's `object`, but if you get the type of a single entry (e.g. `type(payment.user_name[0])`), it might be more specific.
Also, what is the type of the index(es)?
In[4]: type(payment.user_name[0])
Out[4]: str
It worked in the same dataset using the original Pandas syntax
g = payment.sort_values(["user_name", "game_id", "date"]).groupby(["user_name", "game_id"])
payment["paid_time_all"] = g["date"].rank(method="first")
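For anyone who wants to try the plain pandas version without the original database, here is a minimal sketch on a toy frame. The rows are made up for illustration and are not the reporter's data; only the column names and the two pandas statements come from the snippet above.

```python
import pandas as pd

# Toy stand-in for `payment`; every value here is invented for illustration.
payment = pd.DataFrame({
    "user_name": ["a", "a", "b", "a", "b"],
    "game_id": [1, 1, 1, 2, 1],
    "date": pd.to_datetime([
        "2017-01-03", "2017-01-01", "2017-01-02", "2017-01-05", "2017-01-01",
    ]),
})

# Same two statements as in the comment above.
g = payment.sort_values(["user_name", "game_id", "date"]).groupby(["user_name", "game_id"])

# rank() returns a Series indexed like the original rows, so the assignment
# lines up with `payment` even though the grouped frame was sorted first.
payment["paid_time_all"] = g["date"].rank(method="first")

print(payment)
```

Each row gets the position of its payment date within its `(user_name, game_id)` group, starting at 1.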
What are the types of the indexes? Something like `payment.columns` should give the metadata about the columns.
Also, can you show a complete setup of how you got to the state you're in, e.g. reading in the data and any manipulation you did before doing the `groupby`?
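A quick way to answer these questions is to print the value dtypes, the Python type of each column label, and the index type side by side. The sketch below uses a made-up two-row frame in place of `payment`; the point is that the type of a cell (`type(payment.user_name[0])`) and the type of the column label are separate things.

```python
import pandas as pd

# Toy stand-in for `payment`; the rows are invented for illustration.
payment = pd.DataFrame({
    "user_name": ["00000025", "NyeinChanThu"],
    "game_id": [1, 2],
})

# dtype of the values in each column (string columns are commonly reported as object)
print(payment.dtypes)

# Python type of each column *label*, which is what a column selector has to match
label_types = {col: type(col).__name__ for col in payment.columns}
print(label_types)

# Type of the row index
index_type = type(payment.index).__name__
print(index_type, payment.index.dtype)
```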
I will include a reproducible code snippet including sample from database soon.
I found that, with my dataset, passing the column name as a quoted string avoids the problem. For example, this fails:
first_time_play >> select(X.user_name)
Traceback (most recent call last):
File "<ipython-input-25-7b28bc213d5b>", line 1, in <module>
first_time_play >> select(X.user_name)
File "//anaconda/lib/python2.7/site-packages/dfply/base.py", line 45, in __rrshift__
result = self.function(other_copy)
File "//anaconda/lib/python2.7/site-packages/dfply/base.py", line 52, in <lambda>
return pipe(lambda x: self.function(x, *args, **kwargs))
File "//anaconda/lib/python2.7/site-packages/dfply/base.py", line 179, in __call__
evaluation = self.call_action(args, kwargs)
File "//anaconda/lib/python2.7/site-packages/dfply/base.py", line 285, in call_action
return symbolic.to_callable(symbolic_function)(self.df)
File "//anaconda/lib/python2.7/site-packages/pandas_ply/symbolic.py", line 204, in <lambda>
return lambda *args, **kwargs: obj._eval(dict(enumerate(args), **kwargs))
File "//anaconda/lib/python2.7/site-packages/pandas_ply/symbolic.py", line 142, in _eval
result = evaled_func(*evaled_args, **evaled_kwargs)
File "//anaconda/lib/python2.7/site-packages/dfply/base.py", line 357, in wrapped
return f(*flat_args, **kwargs)
File "//anaconda/lib/python2.7/site-packages/dfply/base.py", line 478, in wrapped
for arg in args[1:]]
File "//anaconda/lib/python2.7/site-packages/dfply/base.py", line 411, in _col_ind_to_position
raise Exception("Column indexer not of type str or int.")
Exception: Column indexer not of type str or int.
But this one worked:
first_time_play >> select("user_name")
Out[26]:
user_name
0 00000025
1 00000025
2 NyeinChanThu
3 00001150
4 00001373
5 00001371
6 00000449
7 00000027
Hey - sorry I've been very busy at work and haven't checked this till now.
This is definitely odd. My initial guess would be that it's checking the type and the name is unicode, so it fails because I don't have it check for unicode in there. But you printed the type and it seems to be str.
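To make that guess concrete: in Python 2, `str` and `unicode` are distinct types, so a check written as `isinstance(label, (str, int))` rejects a `unicode` label. The sketch below only illustrates the guess; it is not the package's actual code and not a confirmed cause. `check_indexer` and `to_text` are hypothetical helpers, and because the sketch is written for Python 3, a `bytes` label plays the role of the mismatched string type.

```python
def check_indexer(label):
    """Hypothetical strict check that accepts only str or int labels."""
    if isinstance(label, (str, int)):
        return label
    raise TypeError("Column indexer not of type str or int: %r" % (label,))


def to_text(label):
    """Decode bytes labels to str; leave every other label unchanged."""
    return label.decode("utf-8") if isinstance(label, bytes) else label


# Made-up labels; the bytes one mimics a string-like label of the "wrong" type.
labels = ["game_id", b"user_name"]

try:
    check_indexer(labels[1])
    rejected = False
except TypeError:
    rejected = True

# Normalizing the labels first lets the same strict check pass.
normalized = [to_text(label) for label in labels]
accepted = [check_indexer(label) for label in normalized]

print(rejected, accepted)
```

If the label types do turn out to be mixed, normalizing them once right after loading the data is usually simpler than loosening every check downstream.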
The problem looks like it's happening in the `_col_ind_to_label` function.
I will do some debugging of this as soon as I can, but it may be a few days. In the meantime, could you try this on the `feature/collapsed-selection` branch? A lot of the internal code has changed in that branch, which I am hoping to make the next generation of this package. I'm interested to see if it is an issue in that one too.
I will try that branch.
Kind regards
Closing this as the issue is for an old version of the package. If this is happening in the new v0.3.x package let me know.
Below is my test data:
When I try to groupby:
My data type:
The data was read from the database using `sqlalchemy`, and `user_name` is stored as varchar. I rechecked with the `diamonds` data using the same command, and it works for `diamonds`, but I could not figure out why. How could I fix the problem?
Kind regards.