plotly / dash

Data Apps & Dashboards for Python. No JavaScript Required.
https://plotly.com/dash
MIT License

markdown text is not parsed while exporting data in dash table #2644

Open kushalmraut opened 11 months ago

kushalmraut commented 11 months ago

General Description: I am experiencing an issue with the Dash DataTable component when exporting data in CSV format. Markdown-formatted text is not parsed during export, and I would like to request a fix for this issue.

dash                 2.8.1
dash-core-components 2.0.0
dash-html-components 2.0.0
dash-renderer        1.9.1
dash-table           5.0.0

Describe the bug

I have created a DataTable with a column containing Markdown-formatted text that includes hyperlinks. When I export the data to CSV using the built-in export feature, the Markdown is not parsed, and the raw Markdown text is exported instead.

Here's the code I'm using:

from dash import Dash, dash_table, html

app = Dash(__name__)

# Sample rows; column2 holds Markdown links.
data = [
    {'column1': 'Text 1', 'column2': '[Link 1](https://example.com)'},
    {'column1': 'Text 2', 'column2': '[Link 2](https://example.org)'},
]

app.layout = html.Div([
    dash_table.DataTable(
        id='datatable',
        columns=[
            {'name': 'Column 1', 'id': 'column1'},
            # 'presentation': 'markdown' renders the cell text as Markdown in the UI.
            {'name': 'Column 2', 'id': 'column2', 'type': 'text', 'presentation': 'markdown'},
        ],
        data=data,
        style_data_conditional=[
            {
                'if': {'column_id': 'column2'},
                'textAlign': 'left',
            },
        ],
        style_table={'overflowX': 'auto'},
        style_cell={'textAlign': 'left'},
        export_format="csv",  # enables the built-in export
    ),
])

if __name__ == '__main__':
    app.run(debug=True)

Actual Behavior: When I export the data to CSV, the Markdown in Column 2 is not parsed, and the raw Markdown text, including the hyperlinks, is exported.

Expected Behavior: I would like the Markdown to be properly parsed during export, and only the text (excluding the Markdown syntax) should be exported. This would result in CSV data that looks like the following:

Screenshots

Current UI Display:

image

Current Exported CSV Content:

image

Expected (Requested) CSV Content:

image

Additional Information: I believe that fixing this issue would improve the usability of Dash DataTable's export functionality, especially when Markdown is used for text formatting. Thank you for your attention to this matter, and I hope to see a resolution soon.

alexcjohnson commented 11 months ago

Thanks @kushalmraut - I can see the use of this, but I don't think the current behavior is a bug; rather, this is a new feature you're requesting. Rendering the markdown and including only the text content in the output results in a substantial loss of information (in your case, the URL is lost; in other cases the formatting carries important meaning, for example super/subscripts).

I can also imagine this being relevant for numbers and dates, though again generally the raw value is a better default for CSV than the displayed format since the raw value is likely more machine-readable.

An alternate solution - that would be more work for the developer, but also more flexible - would be to specify a list of columns to include in the output. Maybe allow export_columns, which is currently just 'all' or 'visible', to also accept a list of column IDs? Then you could make a hidden column containing exactly the text you want exported, and omit the markdown column from the export.
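As a rough illustration of the "hidden column containing exactly the text you want exported" idea, here is a minimal sketch that derives a plain-text column from the Markdown column before the rows are passed to the table. The helper names (`link_text`, `add_export_column`) and the `column2_text` id are hypothetical, and the regex only handles inline `[text](url)` links, not images or reference-style links. Note that a list-valued export_columns is only a proposal in this thread, so today the Markdown column would still appear in the export alongside the new one.

```python
import re

# Matches inline Markdown links of the form [text](url).
# Images and reference-style links are not handled in this sketch.
_LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]*)\)")


def link_text(markdown: str) -> str:
    """Replace each [text](url) with just its text."""
    return _LINK.sub(r"\1", markdown)


def add_export_column(rows, source_id, export_id):
    """Return new rows carrying a plain-text copy of a Markdown column."""
    return [{**row, export_id: link_text(row[source_id])} for row in rows]


data = [
    {"column1": "Text 1", "column2": "[Link 1](https://example.com)"},
    {"column1": "Text 2", "column2": "[Link 2](https://example.org)"},
]

# The original rows are left untouched; the copies gain 'column2_text'.
rows = add_export_column(data, "column2", "column2_text")
print(rows[0]["column2_text"])
```

The plain-text column could then be declared in the table's columns and hidden from view, leaving the Markdown column for display only.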

Finally I'll note that AG Grid (and hence dash-ag-grid) does the same thing we do (with partial flexibility via useValueFormatterForExport):

The same data that is in the grid gets exported, but none of the GUI representation of the data will be. What this means is:

The raw values, and not the result of cell renderer will get used, meaning:

Value Getters will be used.
Cell Renderers will NOT be used.
Cell Formatters will NOT be used (use processCellCallback instead), unless Use Value Formatter for Export is enabled.

kushalmraut commented 11 months ago

Thank you @alexcjohnson for your response. It would be a helpful feature if you could make that change to export_columns and add the ability to pass a list of column IDs.

Please let me know if you guys will be working on this anytime soon.

pioneerHitesh commented 10 months ago

@alexcjohnson I would like to give this a try. Can this be assigned to me?

pioneerHitesh commented 9 months ago

@alexcjohnson I may be able to start work on this in December. Can it wait that long?

alexcjohnson commented 9 months ago

No problem @pioneerHitesh - this would be nice to have but it's not on our roadmap at the moment so we welcome your contribution whenever you get to it.

pioneerHitesh commented 8 months ago

@alexcjohnson if we want to add the options to the export_columns property, then we will have to make changes in the plotly.js repo, as export_columns does its processing on the JS side. An alternate solution would be to create a custom function in Python to export the data and use the send_file function. So we can either create a new PR in the plotly.js repo or create a custom function exclusively for plotly dash. I want to know your opinion on this.
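For reference, the Python-side alternative mentioned above could look roughly like the sketch below: a function that builds CSV text from the table's rows for a chosen list of column IDs. The function name `rows_to_csv` and the `column2_text` column are hypothetical, and the wiring into a Dash app (for example a callback that hands the string to a download component) is omitted and would be an assumption on top of this.

```python
import csv
import io


def rows_to_csv(rows, column_ids, headers=None):
    """Serialize selected columns of a list of row dicts to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Use display names for the header row when given, else the column IDs.
    writer.writerow(headers or column_ids)
    for row in rows:
        # Missing keys become empty cells rather than raising KeyError.
        writer.writerow([row.get(col, "") for col in column_ids])
    return buffer.getvalue()


rows = [
    {"column1": "Text 1", "column2": "[Link 1](https://example.com)",
     "column2_text": "Link 1"},
    {"column1": "Text 2", "column2": "[Link 2](https://example.org)",
     "column2_text": "Link 2"},
]

# Export the plain-text column in place of the Markdown one.
csv_text = rows_to_csv(rows, ["column1", "column2_text"],
                       headers=["Column 1", "Column 2"])
print(csv_text)
```

This keeps the choice of exported columns in Python, at the cost of bypassing the table's built-in export button.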

alexcjohnson commented 8 months ago

@pioneerHitesh yes I think the JS side is the place to do this, but that's still in this repo. For example here's where we interface with the remarkable package, that's ultimately what we use for markdown rendering: https://github.com/plotly/dash/blob/dev/components/dash-table/src/dash-table/utils/Markdown.ts

And here's the React component that renders this inside the table: https://github.com/plotly/dash/blob/dev/components/dash-table/src/dash-table/components/CellMarkdown/index.tsx