Open mvashishtha opened 4 days ago
The current pandas behavior matches python stdlib csv
. The reason quoting becomes necessary is to distinguish a 1-element row with the single empty field from an empty row.
In [64]: with StringIO() as f:
...: writer = csv.writer(f)
...: writer.writerow(['col0'])
...: writer.writerow(['a'])
...: writer.writerow([None])
...: f.seek(0)
...: print(f.read())
...:
col0
a
""
Setting writer = csv.writer(f, quoting=csv.QUOTE_NONE)
raises with Error: single empty field record must be quoted
xref: https://github.com/pandas-dev/pandas/issues/18676 xref cpython: https://github.com/python/cpython/issues/76436
Pandas version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
The
None
in the dataframe becomes an empty string when writing both columns, but""
when writing just the first column.Expected Behavior
Should be consistent and use the representation for missing data in
na_rep
in both cases.Installed Versions
versions
``` INSTALLED VERSIONS ------------------ commit : bdc79c146c2e32f2cab629be240f01658cfb6cc2 python : 3.9.19.final.0 python-bits : 64 OS : Darwin OS-release : 23.5.0 Version : Darwin Kernel Version 23.5.0: Wed May 1 20:14:38 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T6020 machine : arm64 processor : arm byteorder : little LC_ALL : None LANG : en_US.UTF-8 LOCALE : en_US.UTF-8 pandas : 2.2.1 numpy : 1.26.4 pytz : 2024.1 dateutil : 2.8.2 setuptools : 68.2.2 pip : 23.3.1 Cython : None pytest : 7.4.4 hypothesis : None sphinx : 7.3.7 blosc : None feather : None xlsxwriter : None lxml.etree : None html5lib : None pymysql : None psycopg2 : None jinja2 : 3.1.3 IPython : 8.12.0 pandas_datareader : None adbc-driver-postgresql: None adbc-driver-sqlite : None bs4 : None bottleneck : None dataframe-api-compat : None fastparquet : None fsspec : 2024.3.1 gcsfs : None matplotlib : 3.8.4 numba : None numexpr : None odfpy : None openpyxl : 3.1.2 pandas_gbq : None pyarrow : 16.0.0 pyreadstat : None python-calamine : None pyxlsb : None s3fs : None scipy : 1.13.0 sqlalchemy : None tables : None tabulate : 0.9.0 xarray : None xlrd : None zstandard : None tzdata : 2024.1 qtpy : None pyqt5 : None ```