jmbejara / comp-econ-sp18

Main Course Repository for Computational Methods in Economics (Econ 21410, Spring 2018)
Behavior of np.sum on pandas dataframes #59

Closed Jacob-Bishop closed 6 years ago

Jacob-Bishop commented 6 years ago

It looks like:

    import numpy as np
    import pandas as pd

    test = pd.DataFrame(np.array([[1, 1, 1, 1], [3, 3, 3, 3]]))
    print(np.shape(test))
    np.sum(test)

produces a length-4 result (one sum per column) instead of the single number you'd expect from the numpy documentation. Why is this? Is there any way to have it sum over all axes instead?
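For reference, here is a minimal sketch of two ways to get a single grand total from the same DataFrame. It assumes a pandas version that provides `DataFrame.to_numpy()`. The variable names `col_sums`, `total_via_numpy`, and `total_via_pandas` are made up for illustration. Spelling out the axis also avoids depending on how a particular pandas version treats `axis=None`.

```python
import numpy as np
import pandas as pd

test = pd.DataFrame(np.array([[1, 1, 1, 1], [3, 3, 3, 3]]))

# Column-wise sums: a Series with one entry per column.
col_sums = test.sum(axis=0)

# Grand total, option 1: convert to a plain NumPy array, then sum everything.
total_via_numpy = test.to_numpy().sum()

# Grand total, option 2: sum each column, then sum the resulting Series.
total_via_pandas = test.sum(axis=0).sum()

print(col_sums.tolist())   # [4, 4, 4, 4]
print(total_via_numpy)     # 16
print(total_via_pandas)    # 16
```

Option 1 drops the index and column labels before summing, so it behaves like `np.sum` on an ordinary array.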

jmbejara commented 6 years ago

I don't know exactly why, but my guess is that it is because test is not a numpy array. From what I recall, you can think of a DataFrame as a dictionary of individual Series objects. However, I really don't know the internals of how this works. How do they write this code so that it returns a Series object with the column index preserved? Maybe numpy is pandas-aware? Kinda cool, but I don't know.
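A sketch of what seems to be going on: NumPy does not need to know about pandas specifically. When `np.sum` is handed something that is not a plain `ndarray`, it first looks for a `sum` method on that object and, if it finds one, calls it. A DataFrame has its own `sum` method, so pandas decides what comes back. The class `HasOwnSum` below is hypothetical and only illustrates that hand-off; it is not how pandas implements its method.

```python
import numpy as np


class HasOwnSum:
    """Hypothetical stand-in for any non-ndarray object with a .sum method."""

    def sum(self, axis=None, out=None, **kwargs):
        # np.sum forwards axis and out (plus any other explicitly set
        # keyword arguments) to this method and returns whatever it returns.
        return "HasOwnSum.sum was called"


# np.sum hands the work to the object's own .sum method.
delegated = np.sum(HasOwnSum())
print(delegated)  # HasOwnSum.sum was called

# A plain ndarray has no such detour: all elements are summed to a scalar.
plain = np.sum(np.array([[1, 1, 1, 1], [3, 3, 3, 3]]))
print(plain)  # 16
```

So the result of `np.sum(test)` is really the result of `test.sum(...)`, which is why the column labels survive.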