Closed by Lanrzip 4 years ago
@ddjhpxs apologies for not getting back to you immediately. I saw your issue post and subsequently forgot about it. :grimacing:
I suspect this is due to a long-standing issue with Pandas and its Stata writer, see: https://github.com/TiesdeKok/ipystata/issues/28
However, it has been a while, so I am curious to see whether it got solved in the meantime. Let me look into it!
Looks like we are in luck: about a month ago, Pandas 1.0 added support for writing Unicode in the Stata writer (https://github.com/pandas-dev/pandas/blob/4a74463d0244acea98f4fd49182dcf5ea6709f19/doc/source/whatsnew/v1.0.0.rst)
It would be very helpful if you could provide me with a sample file that raises the encoding error. That way I can test whether it is resolved when Stata 15+ and Pandas 1.0 are used, or whether modifications to ipystata are required.
instance.zip: here is the sample code and CSV file. I ran it in Jupyter Lab. Hope that will be helpful.
Thanks, I was able to replicate the problem and solve it using Pandas 1.0!
It required a couple of minor modifications to make everything compatible with UTF-8 encoding. I've uploaded the new iPyStata version (0.4.0) to GitHub.
Could you try to see whether this also resolves the problem on your end? You can follow these steps to do so:
1. Check that Pandas 1.0 or later is installed by running pd.__version__ in Jupyter Lab.
2. pip uninstall ipystata
3. pip install git+https://github.com/TiesdeKok/ipystata
One thing that I am uncertain about is the version of Stata that is necessary for this to work. The encoding requires DTA file format version 118, which I believe requires Stata 14 or higher.
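For reference, here is a minimal sketch of what writing a version 118 DTA file looks like directly in Pandas, independent of ipystata. It assumes Pandas 1.0 or later; the column names, sample strings, and file name are made up for illustration, and this is not necessarily how ipystata calls the writer internally.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data with Chinese text in a string column.
df = pd.DataFrame({
    "name": pd.Series(["中文", "数据"], dtype=object),
    "value": [1, 2],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.dta")
    # version=118 selects the DTA format that stores strings as UTF-8.
    df.to_stata(path, version=118, write_index=False)
    # Read the file back to check that the text survived the round trip.
    round_trip = pd.read_stata(path)

print(round_trip["name"].tolist())
```

If this runs without an encoding error on your machine, the Pandas side is fine and any remaining problem is more likely in the installed ipystata version or in Stata itself.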
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 0-1: ordinal not in range(256)
Could you please tell me what that means? Doesn't ipystata support the Chinese language?
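For context, the error message itself comes from Python: Latin-1 can only represent code points 0 to 255, and Chinese characters fall far outside that range, so any code path that still encodes text as Latin-1 will fail on them. The sketch below reproduces the same kind of message using only the standard library; the sample string is chosen purely for illustration and is not taken from the attached file.

```python
text = "中文"  # two Chinese characters, used only as an example

try:
    # Latin-1 covers code points 0-255 only, so this raises.
    text.encode("latin-1")
except UnicodeEncodeError as err:
    latin1_error = str(err)
    print(latin1_error)

# UTF-8 can represent the same text without any problem.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes.decode("utf-8") == text)
```

Seeing this error after the update would suggest that the text is still being written with a Latin-1 encoder somewhere, for example because an older Pandas or ipystata version is still the one being imported.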