Open sree1658 opened 2 months ago
To include subheadings in your Markdown table when using the htmltabletomd library, the subheadings need to be represented in the HTML table structure in a way the library can interpret. The library generally converts standard table headers and data, but subheadings spread over several header rows may not be recognized.
Here's an example of how to structure your HTML table to include subheadings:
<table>
<thead>
<tr>
<th rowspan="2">Main Header 1</th>
<th colspan="2">Sub Header Group</th>
</tr>
<tr>
<th>Sub Header 1</th>
<th>Sub Header 2</th>
</tr>
</thead>
<tbody>
<tr>
<td>Data 1</td>
<td>Data 2</td>
<td>Data 3</td>
</tr>
<tr>
<td>Data 4</td>
<td>Data 5</td>
<td>Data 6</td>
</tr>
</tbody>
</table>
@Mian-Ahmed-Raza Thanks for your help. I have a similar structure on the website, but the sub-headers are captured as null. Attaching the structure below.
@sree1658 The issue might be due to how the
<table border="0" cellpadding="2" cellspacing="0" id="lsd_table" align="center" class="al_center">
<caption>Progress of Sowing of Kharif Crops (As of 23 Sep 2024)</caption>
<thead>
<tr>
<th class="hbg hal hft"> </th>
<th class="hbg hac hft" colspan="3">Actual area ('000 hectare)</th>
<th class="hbg hac hft" colspan="2">% change in actual area sown</th>
</tr>
<tr>
<th class="hbg hal hft"> </th>
<th class="hbg har hft">2022</th>
<th class="hbg har hft">2023</th>
<th class="hbg har hft">2024</th>
<th class="hbg har hft">2023</th>
<th class="hbg har hft">2024</th>
</tr>
</thead>
<tbody>
<tr>
<td class="bg al ft">Agricultural products</td>
<td class="bg ar ft">109,923</td>
<td class="bg ar ft">108,826</td>
<td class="bg ar ft">110,465</td>
<td class="bg ar ft">-1.0</td>
<td class="bg ar ft">1.5</td>
</tr>
</tbody>
</table>
Now, the sub-headers should appear correctly without being captured as null. If you still face issues or need further assistance with this table or any other aspect, let me know!
@Mian-Ahmed-Raza - I am trying to scrape a website, so I won't be able to change the HTML format. Thanks for the help anyway. Lesson learnt.
@sree1658 no problem, dear.
When I use htmltabletomd.convert_table(html_table), only the headers and data are converted to the table. The sub-headers are not considered part of the data and are printed as null.
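If the scraped HTML cannot be edited, one possible workaround is to flatten the multi-row header in code before building the Markdown. The sketch below is only an illustration of that idea: it uses Python's standard html.parser and does not call htmltabletomd at all. The names TableGrid, flatten_header and table_to_markdown are hypothetical, and it assumes a single, non-nested table whose header rows sit inside a thead element. Cell text containing "|" is not escaped.

```python
from html.parser import HTMLParser


class TableGrid(HTMLParser):
    """Collect header and body rows, expanding colspan/rowspan into a grid."""

    def __init__(self):
        super().__init__()
        self.head, self.body = [], []  # each row is a list of cell texts
        self._section = None           # "thead", "tbody" or None
        self._row = None               # row currently being built
        self._cell = None              # text chunks of the current cell
        self._span = (1, 1)            # (colspan, rowspan) of the current cell
        self._pending = {}             # column index -> [rows_left, text]

    def handle_starttag(self, tag, attrs):
        if tag in ("thead", "tbody"):
            self._section = tag
            self._pending = {}
        elif tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            a = dict(attrs)
            self._span = (int(a.get("colspan") or 1), int(a.get("rowspan") or 1))
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def _fill_pending(self):
        # Copy down cells that an earlier row carried over with rowspan.
        while len(self._row) in self._pending:
            col = len(self._row)
            left, text = self._pending[col]
            self._row.append(text)
            if left > 1:
                self._pending[col] = [left - 1, text]
            else:
                del self._pending[col]

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            text = " ".join("".join(self._cell).split())  # normalise whitespace
            colspan, rowspan = self._span
            for _ in range(colspan):
                self._fill_pending()
                if rowspan > 1:
                    self._pending[len(self._row)] = [rowspan - 1, text]
                self._row.append(text)
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self._fill_pending()
            target = self.head if self._section == "thead" else self.body
            target.append(self._row)
            self._row = None
        elif tag in ("thead", "tbody"):
            self._section = None


def flatten_header(head_rows, sep=" - "):
    """Merge stacked header rows into one label per column."""
    if not head_rows:
        return []
    width = max(len(row) for row in head_rows)
    labels = []
    for col in range(width):
        parts = []
        for row in head_rows:
            text = row[col] if col < len(row) else ""
            if text and text not in parts:  # skip blanks and repeated labels
                parts.append(text)
        labels.append(sep.join(parts))
    return labels


def table_to_markdown(html):
    """Render one HTML table as a Markdown table with a single header row."""
    grid = TableGrid()
    grid.feed(html)
    header = flatten_header(grid.head)
    lines = ["| " + " | ".join(header) + " |",
             "| " + " | ".join("---" for _ in header) + " |"]
    for row in grid.body:
        lines.append("| " + " | ".join(row) + " |")
    return "\n".join(lines)
```

Calling table_to_markdown(html_table) on the scraped table string returns the Markdown text. With the Kharif crops table from this thread, each year column would be labelled with its group and year joined by " - ", and the blank first header cell stays empty. Whether the flattened labels suit your output is a design choice; change sep or the merge rule in flatten_header as needed.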