thombashi / pytablewriter

pytablewriter is a Python library to write a table in various formats: AsciiDoc / CSV / Elasticsearch / HTML / JavaScript / JSON / LaTeX / LDJSON / LTSV / Markdown / MediaWiki / NumPy / Excel / Pandas / Python / reStructuredText / SQLite / TOML / TSV.
https://pytablewriter.rtfd.io/
MIT License

MarkdownTableWriter outputs ANSI Escape Sequences #30

Closed calebstewart closed 4 years ago

calebstewart commented 4 years ago

Problem

Creating a MarkdownTableWriter and specifying a column style with font_weight="bold" correctly outputs **content**, but the entire cell (including the asterisks) is also surrounded by ANSI escape sequences for bold text. The report I'm generating is eventually sent through pandoc, which renders the ANSI sequences as ugly \xAB characters in a browser.

Expected output

A normal table cell with just ** on either side of the data. No ANSI escape sequences.

Steps to reproduce

Python 3.8.3 (default, May 17 2020, 18:15:42)
[GCC 10.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pytablewriter
>>> writer = pytablewriter.MarkdownTableWriter()
>>> writer.headers = ["Test", "Test"]
>>> writer.value_matrix = [("hello", "world"), ("goodbye", "world")]
>>> writer.column_styles = [pytablewriter.style.Style(font_weight="bold"), pytablewriter.style.Style()]
>>> import sys
>>> writer.dump(sys.stdout, close_after_write=False)
|   Test    |Test |
|-----------|-----|
|**hello**  |world|
|**goodbye**|world|

You can't see the ANSI escape sequences in the output above, obviously, but they were there when I tested it just now. I installed the most recent version from PyPI, and my Python version is shown in the output above.

edit: forgot to mention, this also happens when writing somewhere that is not stdout. My use-case is actually writing to a file through writer.dump. The example above was just a PoC.
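Since a terminal interprets the escape sequences instead of displaying them, one way to confirm they are present is to look at the repr() of the output. The snippet below is a minimal sketch using only the standard library; the sample string is an assumption about what a styled cell might look like (SGR bold `ESC[1m` and reset `ESC[0m`), not captured pytablewriter output, and `has_ansi_escape` is a hypothetical helper.

```python
# Assumed sample: a Markdown row whose first cell is wrapped in the
# SGR "bold" (ESC[1m) and "reset" (ESC[0m) sequences. The exact bytes
# pytablewriter emits may differ.
sample_line = "|\x1b[1m**hello**\x1b[0m|world|"


def has_ansi_escape(text: str) -> bool:
    """Return True if the text contains the ESC control character."""
    return "\x1b" in text


print(sample_line)        # a terminal interprets the escapes, so they are not visible
print(repr(sample_line))  # repr() shows the escapes as \x1b[1m ... \x1b[0m
print(has_ansi_escape(sample_line))
```

The same check works on the contents of a file written through writer.dump, for example by reading the file back and passing the text to the helper.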

calebstewart commented 4 years ago

Tracing through the code, it seems to be caused by a chain of calls starting in writer/text/_markdown.py. The methods _to_header_item and _to_row_item each call the corresponding super()._to_*_item:

    def _to_header_item(self, col_dp: ColumnDataProperty, value_dp: DataProperty) -> str:
        return self.__escape_vertical_bar_char(super()._to_header_item(col_dp, value_dp))

    def _to_row_item(self, row_idx: int, col_dp: ColumnDataProperty, value_dp: DataProperty) -> str:
        return self.__escape_vertical_bar_char(super()._to_row_item(row_idx, col_dp, value_dp))

In turn, AbstractTableWriter calls apply_terminal_style:

    def _to_header_item(self, col_dp: ColumnDataProperty, value_dp: DataProperty) -> str:
        format_string = self._get_header_format_string(col_dp, value_dp)
        header = String(value_dp.data).force_convert().strip()
        default_style = self._get_col_style(col_dp.column_index)
        style = self._fetch_style_from_filter(-1, col_dp, value_dp, default_style)

        return self._styler.apply_terminal_style(format_string.format(header), style=style)

The MarkdownStyler does not implement this method, which means the call falls through to its parent class TextStyler, which appears to add the ANSI escape sequences according to the selected style:

    def apply_terminal_style(self, value: str, style: Style) -> str:
        if not self._writer.colorize_terminal:
            return value

        ansi_styles = []

        if style.decoration_line in (DecorationLine.STRIKE, DecorationLine.LINE_THROUGH):
            ansi_styles.append("strike")
        if style.decoration_line == DecorationLine.UNDERLINE:
            ansi_styles.append("underline")

        if style.font_weight == FontWeight.BOLD:
            ansi_styles.append("bold")

        return tcolor(value, color=style.color, bg_color=style.bg_color, styles=ansi_styles)

I was curious so I went through the code. Hope this was helpful. :+1:
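The chain described above can be reproduced in isolation. The following is a toy model, not pytablewriter's code: the `Toy*` classes and the `apply_markup` method are hypothetical stand-ins, and only the name `apply_terminal_style` and the `colorize_terminal` flag are taken from the snippets quoted above. It shows how a subclass that adds Markdown markup but does not override the terminal styling method ends up with both.

```python
class ToyStyle:
    """Hypothetical stand-in for a style object with a bold flag."""

    def __init__(self, bold: bool = False) -> None:
        self.bold = bold


class ToyTextStyler:
    """Hypothetical parent styler that adds ANSI sequences."""

    def __init__(self, colorize_terminal: bool = True) -> None:
        self.colorize_terminal = colorize_terminal

    def apply_markup(self, value: str, style: ToyStyle) -> str:
        # Plain text has no markup of its own.
        return value

    def apply_terminal_style(self, value: str, style: ToyStyle) -> str:
        if not self.colorize_terminal:
            return value
        if style.bold:
            # SGR bold on, then reset.
            return f"\x1b[1m{value}\x1b[0m"
        return value


class ToyMarkdownStyler(ToyTextStyler):
    """Adds Markdown markup but inherits apply_terminal_style unchanged."""

    def apply_markup(self, value: str, style: ToyStyle) -> str:
        return f"**{value}**" if style.bold else value


bold = ToyStyle(bold=True)

styler = ToyMarkdownStyler()
cell = styler.apply_terminal_style(styler.apply_markup("hello", bold), bold)
print(repr(cell))  # markup and ANSI sequences are both present

plain_styler = ToyMarkdownStyler(colorize_terminal=False)
plain_cell = plain_styler.apply_terminal_style(
    plain_styler.apply_markup("hello", bold), bold
)
print(repr(plain_cell))  # only the Markdown markup remains
```

In this model, the early return guarded by the flag is the only thing that prevents the inherited method from wrapping the already marked-up cell.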

thombashi commented 4 years ago

@calebstewart Thank you for your detailed report.

I will consider adding an interface for disabling ANSI escape sequences in the output in a future release.

You can work around the issue by downgrading pytablewriter to 0.51.0 or older.
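If downgrading is not an option, another possible workaround (not suggested in this thread, so treat it as an assumption) is to post-process the generated text and strip the escape sequences before passing it to pandoc. The sketch below uses only the standard library; `strip_ansi` is a hypothetical helper, and the pattern covers CSI sequences such as `ESC[1m` and `ESC[0m`, which is the form used for bold and reset.

```python
import re

# CSI sequence: ESC [ , parameter bytes (0x30-0x3F), intermediate bytes
# (0x20-0x2F), then one final byte (0x40-0x7E).
ANSI_CSI_PATTERN = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text: str) -> str:
    """Remove ANSI CSI escape sequences, leaving all other text untouched."""
    return ANSI_CSI_PATTERN.sub("", text)


# Assumed sample of a styled Markdown row, not captured library output.
styled_row = "|\x1b[1m**hello**\x1b[0m|world|"
print(strip_ansi(styled_row))
```

Text without escape sequences passes through unchanged, so the helper can be applied to the whole report.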

calebstewart commented 4 years ago

Thanks! Is there anything significant lost by downgrading to 0.51.0? I'm currently running 0.54.0.

thombashi commented 4 years ago

For the MarkdownTableWriter class, some styles are not available in that version. Other than that, nothing has changed significantly.

You can find change logs at https://github.com/thombashi/pytablewriter/releases

thombashi commented 4 years ago

I have released pytablewriter 0.55.0. In this version, the dump method disables ANSI escape sequences while it runs.

You can also disable ANSI escape sequences manually via the enable_ansi_escape attribute of the writer classes for other methods such as write_table/dumps.