casact / chainladder-python

Actuarial reserving in Python
https://chainladder-python.readthedocs.io/en/latest/
Mozilla Public License 2.0
192 stars, 73 forks

Error storing <class 'chainladder.core.triangle.Triangle'> object in pd.DataFrame #142

Open goduckie opened 3 years ago

goduckie commented 3 years ago

The issue is as described in the title; other classes in the package do not cause this error.

The tidiest workaround I've found is to pickle and unpickle the object.

Example:

import pickle

import pandas as pd
import chainladder as cl

tri_name = 'clrd'
tri_obj = cl.load_sample(tri_name)
# storing the Triangle object directly in a cell (the case reported as failing)
df_fails = pd.DataFrame(data=[[tri_name, tri_obj]], columns=['name', 'cl_triangle'])

# workaround: store the pickled bytes in the cell, then unpickle on retrieval
pickle_tri = pickle.dumps(tri_obj)
df = pd.DataFrame(data=[[tri_name, pickle_tri]], columns=['name', 'cl_triangle'])
tri_obj = pickle.loads(df.loc[0, 'cl_triangle'])

jbogaardt commented 3 years ago

If you want to store the triangle as a dataframe, you can use the to_frame method.

import chainladder as cl

tri_name = 'clrd'
original_tri = cl.load_sample(tri_name)
# store as dataframe
tri_df = original_tri.to_frame(implicit_axis=True)

# turn dataframe back into triangle
reconstituted_tri = cl.Triangle(
    tri_df.reset_index(),
    index=original_tri.key_labels,
    origin='origin',
    development='valuation',
    columns=original_tri.columns.to_list(),
)

assert original_tri.sort_index() == reconstituted_tri.sort_index()

This is also in the docs under the Converting to DataFrame section.

goduckie commented 3 years ago

Why is that a better approach than the pickle?

It seems like it's more trouble to reconstitute it from a frame, as one needs to pull the properties from the frame. The above workaround assumes you still have original_tri to supply the index and columns; if that were the case, one would just use original_tri in the first place.

Any ideas why it cannot be stored directly? First time I've come across this issue with a class object, wasn't able to find any similar references online.

jbogaardt commented 3 years ago

I guess it depends on what you're trying to do. As a pickle, or as an object in a DataFrame cell, the triangle's data cannot be manipulated. With the to_frame method, the data is there for further processing.

As to why it fails, it doesn't really.

import pandas as pd
import chainladder as cl

tri_name = 'clrd'
tri_obj = cl.load_sample(tri_name)
df_fails = pd.DataFrame(data=[[tri_name, tri_obj]], columns=['name', 'cl_triangle'])

df_fails.iloc[0,1] # this prints the triangle just fine

It is displaying the whole DataFrame that doesn't work. My guess is that there is a clash between pandas' _repr_html_ and the triangle's _repr_html_.
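
If the problem really is limited to rendering, one possible way around it is to hide the object behind a small wrapper that only exposes a plain text repr. The sketch below is illustrative only: Boxed and StandInTriangle are hypothetical names, not part of chainladder or pandas, and the stand-in class merely mimics an object that defines its own _repr_html_. It has not been checked against a real Triangle.

```python
import pandas as pd


class StandInTriangle:
    """Hypothetical stand-in for an object with its own rich HTML repr."""

    def __init__(self, name):
        self.name = name

    def _repr_html_(self):
        return "<table><tr><td>triangle data</td></tr></table>"


class Boxed:
    """Hypothetical opaque wrapper: pandas only ever sees this plain repr."""

    def __init__(self, obj):
        self.obj = obj

    def __repr__(self):
        return f"<Boxed {type(self.obj).__name__}>"


tri_name = "clrd"
tri_obj = StandInTriangle(tri_name)

# Store the wrapper, not the object itself, in the cell.
df = pd.DataFrame(data=[[tri_name, Boxed(tri_obj)]], columns=["name", "cl_triangle"])

# Rendering the whole frame only calls Boxed.__repr__ for that cell.
print(df)

# Unwrap to get the original object back, with no serialisation step.
restored = df.loc[0, "cl_triangle"].obj
```

Unlike the pickle approach, the cell holds a reference to the live object rather than a byte string, so nothing is copied; the trade-off is the extra .obj access on retrieval.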

goduckie commented 3 years ago

Thanks for pointing that out, I hadn't noticed that. Bit quirky, think the pickle has it - at least for now ;)

jbogaardt commented 3 years ago

Thanks, will look into how to make the repr work.