dodger487 / dplython

dplyr for python
MIT License

Remove dataframe copying on each >> operation #51

Open dodger487 opened 8 years ago

dodger487 commented 8 years ago

Currently, dplython makes a copy of the DataFrame every time >> is used. The copy is there to prevent dplython from inadvertently altering the contents of the original DataFrame when executing operations. See this pandas reference on views versus copies: http://pandas.pydata.org/pandas-docs/stable/indexing.html#returning-a-view-versus-a-copy

It would be great if we could restrict this behavior, or push it down into key verbs (such as mutate), as copying on every >> is very inefficient on large data sets.
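To make the idea concrete, here is a rough sketch of what "push the copy into key verbs" could look like. It is not dplython's actual implementation: `Frame`, `Verb`, the `writes` flag, and this version of `mutate` and `sift` are all hypothetical stand-ins written only for illustration. The only real library calls are pandas' `DataFrame.copy()` and boolean-mask indexing.

```python
import pandas as pd


class Frame:
    """Hypothetical wrapper around a DataFrame (not dplython's DplyFrame)."""

    def __init__(self, df):
        self.df = df

    def __rshift__(self, verb):
        # Copy only when the verb declares that it writes to its input,
        # instead of copying unconditionally on every >>.
        df = self.df.copy() if verb.writes else self.df
        return Frame(verb(df))


class Verb:
    """Hypothetical verb: a function plus a flag saying whether it writes."""

    def __init__(self, func, writes):
        self.func = func
        self.writes = writes

    def __call__(self, df):
        return self.func(df)


def mutate(**columns):
    """Assign new columns in place, so the input must be a private copy."""
    def apply(df):
        for name, func in columns.items():
            df[name] = func(df)
        return df
    return Verb(apply, writes=True)


def sift(predicate):
    """Boolean-mask filtering builds a new DataFrame, so no copy is requested."""
    return Verb(lambda df: df[predicate(df)], writes=False)


original = pd.DataFrame({"x": [1, 2, 3, 4]})
result = (
    Frame(original)
    >> sift(lambda d: d["x"] > 1)
    >> mutate(y=lambda d: d["x"] * 2)
)
direct = Frame(original) >> mutate(z=lambda d: d["x"] + 1)
```

In this sketch only `mutate` pays for a copy; `sift` passes the input straight through, and `original` keeps its single `x` column in both pipelines. Whether every existing verb can be classified this cleanly is an open question.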