astanin / python-tabulate

Pretty-print tabular data in Python, a library and a command-line utility. Repository migrated from bitbucket.org/astanin/python-tabulate.
https://pypi.org/project/tabulate/
MIT License

Can not tabulate table with pd.NA #239

Open mscanlon-exos opened 1 year ago

mscanlon-exos commented 1 year ago

After upgrading to tabulate 0.9.0, you will see the following traceback

 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../<my_code>
    return tabulate(
../../../.venv/lib/python3.10/site-packages/tabulate/__init__.py:2048: in tabulate
    list_of_lists, headers = _normalize_tabular_data(
../../../.venv/lib/python3.10/site-packages/tabulate/__init__.py:1471: in _normalize_tabular_data
    rows = list(map(lambda r: r if _is_separating_line(r) else list(r), rows))
../../../.venv/lib/python3.10/site-packages/tabulate/__init__.py:1471: in <lambda>
    rows = list(map(lambda r: r if _is_separating_line(r) else list(r), rows))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

This happens when trying to tabulate a pandas DataFrame with pd.NA values in its rows.

buhtz commented 1 year ago

It seems that this is my issue, too. The folks at pandas asked me to report it here.

Reproduce with

import pandas
import tabulate

df = pandas.DataFrame([[pandas.NA]])
tabulate.tabulate(df)

The full report, including version information, can be found here: https://github.com/pandas-dev/pandas/issues/50866

It seems that the error happens with tabulate 0.9 but not with 0.8.10.

ilya112358 commented 1 year ago

The trouble comes from how pandas.NA propagates itself in contexts where you would normally expect a boolean:

>>> a = (pandas.NA == "")
>>> a
<NA>
>>> b = (True and a)
>>> b
<NA>
>>> if b:
...     pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas\_libs\missing.pyx", line 382, in pandas._libs.missing.NAType.__bool__
TypeError: boolean value of NA is ambiguous
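
To make the failure mode concrete without needing pandas installed, here is a minimal sketch. Everything in it is hypothetical: `AmbiguousNA` only imitates the pandas.NA behaviour shown above, `MARKER` is a made-up sentinel string, and neither helper is tabulate's actual code or the change made in the pull request. It only illustrates why an unguarded `==` inside a boolean expression can blow up, and how a type check in front of the comparison avoids that.

```python
class AmbiguousNA:
    """Hypothetical stand-in for pandas.NA: comparisons propagate, bool() raises."""

    def __eq__(self, other):
        # Like pandas.NA, a comparison yields NA rather than True/False.
        return self

    def __bool__(self):
        raise TypeError("boolean value of NA is ambiguous")

    def __repr__(self):
        return "<NA>"


NA = AmbiguousNA()
MARKER = "\x01"  # hypothetical sentinel string, chosen only for this sketch


def is_marker_naive(cell):
    # `cell == MARKER` evaluates to NA; `or` then calls bool(NA), which raises.
    return cell == MARKER or False


def is_marker_guarded(cell):
    # The isinstance check short-circuits, so NA is never compared at all.
    return isinstance(cell, str) and cell == MARKER


if __name__ == "__main__":
    try:
        is_marker_naive(NA)
    except TypeError as exc:
        print("naive check failed:", exc)

    print(is_marker_guarded(NA))      # False
    print(is_marker_guarded(MARKER))  # True
```

The guarded version works because `and` stops at the first falsy operand, so the ambiguous comparison is skipped for any cell that is not a string.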

There is already a pull request #232 for separating lines which fixes this issue as well:

>>> df = pandas.DataFrame([[pandas.NA]])
>>> print(tabulate(df))
-  ----
0  <NA>
-  ----

Let's hope it's merged before long.