intro-stat-learning / ISLP_labs

Up-to-date version of labs for ISLP
BSD 2-Clause "Simplified" License

Chapter 3 Linear Regression fails on design.transform() not re-initialized #27

Open linusjf opened 2 months ago

linusjf commented 2 months ago

In the Chapter 3 Linear Regression lab, this snippet fails:


new_df = pd.DataFrame({'lstat':[5, 10, 15]})
newX = design.transform(new_df)
newX

The above snippet throws the following error:

---------------------------------------------------------------------------
SpecificationError                        Traceback (most recent call last)
Cell In[20], line 3
      1 new_df = pd.DataFrame({"lstat": [5,10,15]})
      2 print(new_df)
----> 3 newX = design.transform(new_df)
      4 newX

File ~/ISLP/islpenv/lib/python3.10/site-packages/pandas/core/frame.py:10166, in DataFrame.transform(self, func, axis, *args, **kwargs)
  10163 from pandas.core.apply import frame_apply
  10165 op = frame_apply(self, func=func, axis=axis, args=args, kwargs=kwargs)
> 10166 result = op.transform()
  10167 assert isinstance(result, DataFrame)
  10168 return result

File ~/ISLP/islpenv/lib/python3.10/site-packages/pandas/core/apply.py:241, in Apply.transform(self)
    239 if is_dict_like(func):
    240     func = cast(AggFuncTypeDict, func)
--> 241     return self.transform_dict_like(func)
    243 # func is either str or callable
    244 func = cast(AggFuncTypeBase, func)

File ~/ISLP/islpenv/lib/python3.10/site-packages/pandas/core/apply.py:287, in Apply.transform_dict_like(self, func)
    284 if len(func) == 0:
    285     raise ValueError("No transform functions were provided")
--> 287 func = self.normalize_dictlike_arg("transform", obj, func)
    289 results: dict[Hashable, DataFrame | Series] = {}
    290 for name, how in func.items():

File ~/ISLP/islpenv/lib/python3.10/site-packages/pandas/core/apply.py:655, in Apply.normalize_dictlike_arg(self, how, obj, func)
    648 # Can't use func.values(); wouldn't work for a Series
    649 if (
    650     how == "agg"
    651     and isinstance(obj, ABCSeries)
    652     and any(is_list_like(v) for _, v in func.items())
    653 ) or (any(is_dict_like(v) for _, v in func.items())):
    654     # GH 15931 - deprecation of renaming keys
--> 655     raise SpecificationError("nested renamer is not supported")
    657 if obj.ndim != 1:
    658     # Check for missing columns on a frame
    659     from pandas import Index

SpecificationError: nested renamer is not supported

What's causing the blowup?

linusjf commented 2 months ago

Re-initialising design as follows before running the snippet above resolves the issue. This step is missing from the lab code:

design = MS(["lstat"])

Also, change design.transform(new_df) to

newX = design.fit_transform(new_df)

jonathan-taylor commented 2 months ago

Will take a look and see if I can recreate this exception. I have not seen this error before, and the exception does not refer to any line in ISLP as far as I can see.
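
One possible reading of the traceback, offered as a sketch and not a confirmed diagnosis: its first frame is pandas' own `DataFrame.transform`, which suggests that `design` was bound to a DataFrame, not a ModelSpec, when the cell ran. The snippet below uses only pandas; the `design` values and the `try_transform` helper are hypothetical stand-ins and are not taken from the lab. It checks whether calling `.transform()` on a DataFrame with another DataFrame as the argument produces the same kind of error.

```python
import pandas as pd

# Hypothetical stand-in: `design` is a plain DataFrame here, not a ModelSpec.
design = pd.DataFrame({"intercept": [1.0, 1.0, 1.0], "lstat": [4.98, 9.14, 4.03]})
new_df = pd.DataFrame({"lstat": [5, 10, 15]})


def try_transform(obj, data):
    """Call obj.transform(data); return (exception name, message), or (None, None)."""
    try:
        obj.transform(data)
    except Exception as exc:
        return type(exc).__name__, str(exc)
    return None, None


error_name, error_message = try_transform(design, new_df)

# Printing the type of `design` is the quickest check in the notebook itself.
print(type(design).__name__, error_name, error_message)
```

If `type(design)` in the failing session turns out to be a DataFrame, that would also explain why re-running `design = MS(["lstat"])` makes the error go away.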

linusjf commented 2 months ago

https://github.com/intro-stat-learning/ISLP_labs/blob/main/Ch03-linreg-lab.ipynb?short_path=2720fcd

Line 1251

My working changes can be found here:

https://github.com/linusjf/ISLP/blob/main/Chapter3/Labs.py