Closed LucianU closed 2 years ago
It looks like Issue #15 is back. It's a problem with Patsy, so I don't have an easy way to fix it. Encoding the formula as ascii seemed like it solved the problem, but apparently not.
Since I can't fix it, I added an error message: https://github.com/AllenDowney/ThinkStats2/commit/ca7e911a1aa103b6560661ebf8bd2cc3e6ec76d7
Workarounds: 1) Use Python 2 for this example. 2) Skip this example.
Sorry!
It seems the same problem happens with Python 2 as well. I get the same stack trace as @LucianU. The relevant output of pd.show_versions(as_json=False) in my environment is:
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Darwin
OS-release: 13.4.0
machine: x86_64
pandas: 0.17.0
Cython: 0.22
numpy: 1.10.1
scipy: 0.15.1
statsmodels: 0.6.1
patsy: 0.3.0
I was able to get the sample to work if I encoded the formula as suggested in #15.
formula = ('totalwgt_lb ~ agepreg + ' + name).encode('ascii')
I can confirm that I'm using Python 2.
Right, it looks like we need to encode the formula for both Python 2 and 3.
But in 3 it looks like it doesn't work even with the encode.
So the code in regression.py is the best I can do for now.
The example in the book doesn't include the encode step. I can add it, but I am not sure whether it will decrease the net level of confusion. Thinking...
And does the encoding suggested by Paul Glezen work for you, too?
@AllenDowney, yes it does. I think it's worth adding it to the book and specifying that it's needed because of an issue in patsy.
If I understand the issues:
1) In Python 2, the code in regression.py works because it encodes the patsy formula as ascii. But the code in the book omits this line, so if someone tries to run the code directly from the book, they're going to get a confusing message. I am not sure whether adding this to the book will increase or decrease the total amount of confusion.
2) In Python 3, it seems, the code in regression.py doesn't work despite the fact that it encodes the formula in ascii. It doesn't look like I can fix this.
Hi, for what it's worth, I managed to run the code in Python 3 by commenting out the line that sets the encoding.
Thanks @FlorianGD, commenting out that line worked for me too (in Python 3).
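Putting the reports in this thread together (encoding helps on Python 2, leaving the formula unencoded helps on Python 3), a version check might cover both cases. This is only a sketch based on what people reported here, not a confirmed fix for the underlying patsy issue. The helper name make_formula is made up for illustration, and agepreg and birthord are assumed to be columns in the DataFrame.

```python
import sys


def make_formula(name):
    """Build the regression formula for one explanatory variable.

    Hypothetical helper: returns an ASCII byte string on Python 2
    and a plain (unencoded) string on Python 3, following the
    workarounds reported in this thread.
    """
    formula = 'totalwgt_lb ~ agepreg + ' + name
    if sys.version_info[0] < 3:
        # Python 2: commenters report that encoding as ASCII is needed.
        formula = formula.encode('ascii')
    # Python 3: commenters report the code runs only without the encode step.
    return formula


# Example call; 'birthord' is an assumed column name.
formula = make_formula('birthord')
print(formula)
```

The result could then be passed to smf.ols(formula, data=df) as in regression.py. Whether this holds for other patsy versions is untested.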
I'm getting an error when running the code in this section. Here's the shell session: