Roche / pyreadstat

Python package to read sas, spss and stata files into pandas data frames. It is a wrapper for the C library readstat.

write_sav: uninformative error message for empty column names #276

Open MarkPaulin opened 5 days ago

MarkPaulin commented 5 days ago

Describe the issue

write_sav() raises an uninformative IndexError when writing a data frame that has an empty string as a column name. A column named None, by contrast, produces a much clearer error message.

To Reproduce

import pandas as pd
import pyreadstat

df1 = pd.DataFrame({"a": [1, 2], "": [3, 4]})
pyreadstat.write_sav(df1, "file.sav")
#> Traceback (most recent call last):
#>   File "<stdin>", line 1, in <module>
#>   File "pyreadstat\\pyreadstat.pyx", line 772, in pyreadstat.pyreadstat.write_sav
#>   File "pyreadstat\\_readstat_writer.pyx", line 598, in pyreadstat._readstat_writer.run_write
#> IndexError: string index out of range

df2 = pd.DataFrame({"a": [1, 2], None: [3, 4]})
pyreadstat.write_sav(df2, "file.sav")
#> Traceback (most recent call last):
#>   File "<stdin>", line 1, in <module>
#>   File "pyreadstat\\pyreadstat.pyx", line 772, in pyreadstat.pyreadstat.write_sav
#>   File "pyreadstat\\_readstat_writer.pyx", line 597, in pyreadstat._readstat_writer.run_write
#> pyreadstat._readstat_parser.PyreadstatError: variable name 'None' is of type <class 'NoneType'> and it must be str (not starting with numbers!)

I'm happy to submit a pull request that adds an error for empty-string column names, in line with the one already raised for None.
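
As a rough illustration of the kind of check this would involve, here is a minimal sketch in plain Python. It is not pyreadstat's actual code: `check_column_names` and `ColumnNameError` are hypothetical names invented for this example (the latter standing in for `PyreadstatError`), and the wording of the empty-string message is only a suggestion.

```python
class ColumnNameError(Exception):
    """Hypothetical stand-in for pyreadstat's PyreadstatError."""


def check_column_names(names):
    """Raise a descriptive error for column names that cannot be written.

    Hypothetical helper: mirrors the existing non-str check and adds an
    explicit check for empty strings.
    """
    for position, name in enumerate(names):
        # Existing behaviour reported above: non-str names are rejected.
        if not isinstance(name, str):
            raise ColumnNameError(
                f"variable name '{name}' is of type {type(name)} "
                "and it must be str (not starting with numbers!)"
            )
        # Proposed addition: reject empty strings with a clear message
        # instead of letting an IndexError surface later.
        if name == "":
            raise ColumnNameError(
                f"variable name at position {position} is an empty string"
            )


# Valid names pass silently.
check_column_names(["a", "b"])
```

With a data frame, the same helper could be called as `check_column_names(df.columns)` before writing.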

ofajardo commented 5 days ago

Good catch! Please go ahead with the PR!