Closed dprothero closed 8 months ago
I would like to work on this.
What are the steps to replicate this? HTML doesn't render to Markdown, that doesn't make sense.
Steps to replicate this:
```ts
// Assumption: upstream package names; adjust these if you use a fork.
import TurndownService from 'turndown';
import { gfm } from 'turndown-plugin-gfm';

// Initialize turndownService with the gfm plugin and default settings
const turndownService = new TurndownService();
turndownService.use(gfm);

// Convert a table with a caption to Markdown
const output = turndownService.turndown(`<table>
<caption>Developer Time (in hours)</caption>
<tbody><tr>
<th>Developer</th>
<th>From Scratch</th>
<th>Carbon</th>
</tr>
<tr>
<th>Developer One</th>
<td>4.2 hours from scratch</td>
<td>1.1 hours using Carbon</td>
</tr>
</tbody></table>`);
```
Result:
```
| | | |
| --- | --- | --- |Developer Time (in hours)
| Developer | From Scratch | Carbon |
| Developer One | 4.2 hours from scratch | 1.1 hours using Carbon |
```
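To make the breakage easy to spot in a test, here is a minimal sketch that flags rows of a pipe-delimited table that do not both start and end with `|`. `findMalformedRows` is a hypothetical helper, not part of turndown or the gfm plugin, and it only catches this one kind of damage.

```typescript
// Hypothetical helper (not a turndown API): returns the non-empty lines
// that do not both start and end with a pipe character.
function findMalformedRows(markdown: string): string[] {
  return markdown
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .filter((line) => !/^\|.*\|$/.test(line));
}

// The output reported above, copied line by line.
const broken = [
  '| | | |',
  '| --- | --- | --- |Developer Time (in hours)',
  '| Developer | From Scratch | Carbon |',
  '| Developer One | 4.2 hours from scratch | 1.1 hours using Carbon |',
].join('\n');

console.log(findMalformedRows(broken));
```

For the output above, the only row flagged is the delimiter row with the caption text appended to it.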
Some websites with tables that result in broken Markdown:
As a combined workaround for this bug and #9885, you can add this code right after `turndownService.use(gfm);`:
```ts
// Grab the table rule by index. The index is fragile, so verify it first.
const tableRule = turndownService.rules.array[2];
if (!tableRule.filter.toString().includes('TABLE'))
  throw new Error('Incorrect rule selected. Expected to find table rule');

// Match every <table> element.
tableRule.filter = ['table'];

// Guard: if the rule's source already mentions captions, this workaround is obsolete.
if (tableRule.replacement?.toString().toLowerCase().includes('caption'))
  throw new Error('Turndown received caption support - this workaround should be removed');

// Wrap the original replacement so the caption text is emitted before the table.
const originalReplacement = tableRule.replacement;
tableRule.replacement = (content, node, ...rest) => {
  const caption = (node as HTMLTableElement).caption?.textContent || '';
  const table = originalReplacement?.(content, node, ...rest) ?? '';
  return caption === '' ? table : `${caption}\n\n${table.trimStart()}`;
};

// Emit nothing for <caption>, <colgroup>, and <col> inside the table.
turndownService.addRule('caption', {
  filter: ['caption', 'colgroup', 'col'],
  replacement: () => '',
});
```
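The string handling in the last line of the replacement function can be checked in isolation, without turndown or a DOM. This is a minimal sketch: `prependCaption` is a hypothetical name, and the table string is a hand-written sample, not real turndown output.

```typescript
// Hypothetical helper mirroring the return statement of the workaround:
// an empty caption leaves the table untouched; otherwise the caption is
// placed before the table, separated by a blank line.
function prependCaption(caption: string, table: string): string {
  // replace(/^\s+/, '') has the same effect as trimStart() and also
  // compiles for older TypeScript targets.
  return caption === '' ? table : `${caption}\n\n${table.replace(/^\s+/, '')}`;
}

// Assumed sample input with leading blank lines, for illustration only.
const sampleTable =
  '\n\n| Developer | From Scratch | Carbon |\n' +
  '| --- | --- | --- |\n' +
  '| Developer One | 4.2 hours from scratch | 1.1 hours using Carbon |';

console.log(prependCaption('Developer Time (in hours)', sampleTable));
console.log(prependCaption('', sampleTable) === sampleTable);
```

Stripping only the leading whitespace keeps exactly one blank line between the caption and the table, while any trailing newlines from the original replacement are preserved.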
This issue was originally reported here but it seems that repo has been archived, and the code now lives here.
Tables such as
or
render an empty header, such as
COLGROUP should be ignored, CAPTION should be displayed before the table. Expected output should be: