Tomas2D / puppeteer-table-parser

Scrape and parse HTML tables with the Puppeteer table parser.
MIT License

Cannot parse table when header and one column contain DOM elements. #413

Open TimKieu opened 2 weeks ago

TimKieu commented 2 weeks ago

Please help. I am crawling a table whose header and one column contain DOM elements rather than plain text as in the examples. The error mentions settings and a conversion exception for those.

Tomas2D commented 2 weeks ago

Can I see the HTML table so I can debug it?

Tomas2D commented 7 hours ago

If your table looks like this:

<table id="table-overview">
  <thead>
  <tr>
    <th>A</th>
    <th><input type="checkbox" checked></th>
    <th>C</th>
  </tr>
  </thead>
  <tbody>
  <tr>
    <td>A1</td>
    <td>B1</td>
    <td><img src='#' alt='image'>C1</td>
  </tr>
  <tr>
    <td><a href='#'>A1</a></td>
    <td><input type="checkbox" checked></td>
    <td>C1</td>
  </tr>
  </tbody>
</table>

Then you can do this:

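    import { tableParser } from 'puppeteer-table-parser';

    // `page` is an already-open Puppeteer page that shows the table.
    const data = await tableParser(page, {
      selector: '#table-overview',
      asArray: false,
      allowedColNames: {
        'A': 'A',
        // The checkbox-only header has no text, so it is presumably matched by the empty key.
        '': 'B',
        'C': 'C',
      },
    });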
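
To see why the empty-string key lines up with the checkbox header, here is a minimal sketch of the idea in plain JavaScript. It assumes that a header cell is identified by its text content, which is an assumption about the parser and not something confirmed in this thread. `cellText` is a hypothetical helper, not part of puppeteer-table-parser.

```javascript
// Hypothetical helper (not part of puppeteer-table-parser): approximates the
// text content of a cell by dropping tags and trimming whitespace.
function cellText(html) {
  return html.replace(/<[^>]*>/g, '').trim();
}

// Inner HTML of the three header cells from the example table.
const headerCells = ['A', '<input type="checkbox" checked>', 'C'];

// Same mapping as the allowedColNames option in the call above.
const allowedColNames = { 'A': 'A', '': 'B', 'C': 'C' };

// Assumption: each header is looked up by its text, so a header that holds
// only an <input> element is looked up under ''.
const headerTexts = headerCells.map(cellText);
const mappedNames = headerTexts.map((text) => allowedColNames[text]);

console.log(headerTexts); // [ 'A', '', 'C' ]
console.log(mappedNames); // [ 'A', 'B', 'C' ]
```

Under the same assumption, body cells such as `<img src='#' alt='image'>C1` or `<a href='#'>A1</a>` would reduce to `C1` and `A1`, while a checkbox-only cell would reduce to an empty string.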