lightswitch05 / table-to-json

Serializes HTML tables into JSON objects.
http://lightswitch05.github.io/table-to-json/
MIT License

<th> tags are missing #52

Open onsa opened 4 years ago

onsa commented 4 years ago

I tried parsing the table below, copied from Wiktionary (https://en.wiktionary.org/wiki/by%C4%87), via the Plunker Template. The preview looked correct, but once I hit the Convert! button, the alert window displayed only [].

<table class="wikitable inflection-table" style="margin: 1em auto;">
    <tbody>
        <tr>
            <th rowspan="2"></th>
            <th> 
            </th>
            <th colspan="3"><a href="/wiki/singular" title="singular">singular</a></th>
            <th colspan="2"><a href="/wiki/plural" title="plural">plural</a></th>
        </tr>
        <tr>
            <th><a href="/wiki/grammatical_person" title="grammatical person">person</a></th>
            <th><span class="gender"><abbr title="masculine gender">m</abbr></span></th>
            <th><span class="gender"><abbr title="feminine gender">f</abbr></span></th>
            <th><span class="gender"><abbr title="neuter gender">n</abbr></span></th>
            <th><span class="gender"><abbr title="masculine gender">m</abbr> <abbr title="personal">pers</abbr></span>
            </th>
            <th>other
            </th>
        </tr>
        <tr>
            <th colspan="2"><a href="/wiki/infinitive" title="infinitive">infinitive</a></th>
            <td colspan="5"><span class="Latn" lang="pl"><strong class="selflink">być</strong></span></td>
        </tr>
        <tr>
            <th rowspan="3"><a href="/wiki/present_tense" title="present tense">present indicative</a></th>
            <th><a href="/wiki/first_person" title="first person">1st</a></th>
            <td colspan="3"><span class="Latn" lang="pl"><a href="/wiki/jestem#Polish" title="jestem">jestem</a></span>,
                <span class="Latn" lang="pl"><a href="/wiki/-m#Polish" title="-m">-m</a></span></td>
            <td colspan="2"><span class="Latn" lang="pl"><a href="/wiki/jeste%C5%9Bmy#Polish"
                        title="jesteśmy">jesteśmy</a></span>, <span class="Latn" lang="pl"><a
                        href="/wiki/-%C5%9Bmy#Polish" title="-śmy">-śmy</a></span></td>
        </tr>
        <tr>
            <th><a href="/wiki/second_person" title="second person">2nd</a></th>
            <td colspan="3"><span class="Latn" lang="pl"><a href="/wiki/jeste%C5%9B#Polish"
                        title="jesteś">jesteś</a></span>, <span class="Latn" lang="pl"><a href="/wiki/-%C5%9B#Polish"
                        title="-ś">-ś</a></span></td>
            <td colspan="2"><span class="Latn" lang="pl"><a href="/wiki/jeste%C5%9Bcie#Polish"
                        title="jesteście">jesteście</a></span>, <span class="Latn" lang="pl"><a
                        href="/wiki/-%C5%9Bcie#Polish" title="-ście">-ście</a></span></td>
        </tr>
        <tr>
            <th><a href="/wiki/third_person" title="third person">3rd</a></th>
            <td colspan="3"><span class="Latn" lang="pl"><a href="/wiki/jest#Polish" title="jest">jest</a></span></td>
            <td colspan="2"><span class="Latn" lang="pl"><a href="/wiki/s%C4%85#Polish" title="są">są</a></span></td>
        </tr>
        <tr>
            <th rowspan="3"><a href="/wiki/past_tense" title="past tense">past indicative</a></th>
            <th><a href="/wiki/first_person" title="first person">1st</a></th>
            <td><span class="Latn" lang="pl"><a href="/wiki/by%C5%82em#Polish" title="byłem">byłem</a></span></td>
            <td><span class="Latn" lang="pl"><a href="/wiki/by%C5%82am#Polish" title="byłam">byłam</a></span></td>
            <td> 
            </td>
            <td><span class="Latn" lang="pl"><a href="/wiki/byli%C5%9Bmy#Polish" title="byliśmy">byliśmy</a></span></td>
            <td><span class="Latn" lang="pl"><a href="/wiki/by%C5%82y%C5%9Bmy#Polish" title="byłyśmy">byłyśmy</a></span>
            </td>
        </tr>
        <tr>
            <th><a href="/wiki/second_person" title="second person">2nd</a></th>
            <td><span class="Latn" lang="pl"><a href="/wiki/by%C5%82e%C5%9B#Polish" title="byłeś">byłeś</a></span></td>
            <td><span class="Latn" lang="pl"><a href="/wiki/by%C5%82a%C5%9B#Polish" title="byłaś">byłaś</a></span></td>
            <td> 
            </td>
            <td><span class="Latn" lang="pl"><a href="/wiki/byli%C5%9Bcie#Polish" title="byliście">byliście</a></span>
            </td>
            <td><span class="Latn" lang="pl"><a href="/wiki/by%C5%82y%C5%9Bcie#Polish"
                        title="byłyście">byłyście</a></span></td>
        </tr>
        <tr>
            <th><a href="/wiki/third_person" title="third person">3rd</a></th>
            <td><span class="Latn" lang="pl"><a href="/wiki/by%C5%82#Polish" title="był">był</a></span></td>
            <td><span class="Latn" lang="pl"><a href="/wiki/by%C5%82a#Polish" title="była">była</a></span></td>
            <td><span class="Latn" lang="pl"><a href="/wiki/by%C5%82o#Polish" title="było">było</a></span></td>
            <td><span class="Latn" lang="pl"><a href="/wiki/byli#Polish" title="byli">byli</a></span></td>
            <td><span class="Latn" lang="pl"><a href="/wiki/by%C5%82y#Polish" title="były">były</a></span></td>
        </tr>
        <tr>
            <th rowspan="3"><a href="/wiki/future_tense" title="future tense">future tense</a></th>
            <th><a href="/wiki/first_person" title="first person">1st</a></th>
            <td colspan="3"><span class="Latn" lang="pl"><a href="/wiki/b%C4%99d%C4%99#Polish"
                        title="będę">będę</a></span></td>
            <td colspan="2"><span class="Latn" lang="pl"><a href="/wiki/b%C4%99dziemy#Polish"
                        title="będziemy">będziemy</a></span></td>
        </tr>
        <tr>
            <th><a href="/wiki/second_person" title="second person">2nd</a></th>
            <td colspan="3"><span class="Latn" lang="pl"><a href="/wiki/b%C4%99dziesz#Polish"
                        title="będziesz">będziesz</a></span></td>
            <td colspan="2"><span class="Latn" lang="pl"><a href="/wiki/b%C4%99dziecie#Polish"
                        title="będziecie">będziecie</a></span></td>
        </tr>
        <tr>
            <th><a href="/wiki/third_person" title="third person">3rd</a></th>
            <td colspan="3"><span class="Latn" lang="pl"><a href="/wiki/b%C4%99dzie#Polish"
                        title="będzie">będzie</a></span></td>
            <td colspan="2"><span class="Latn" lang="pl"><a href="/wiki/b%C4%99d%C4%85#Polish"
                        title="będą">będą</a></span></td>
        </tr>
        <tr>
            <th rowspan="3"><a href="/wiki/conditional" title="conditional">conditional</a></th>
            <th><a href="/wiki/first_person" title="first person">1st</a></th>
            <td><span class="Latn" lang="pl"><a href="/wiki/by%C5%82bym#Polish" title="byłbym">byłbym</a></span></td>
            <td><span class="Latn" lang="pl"><a href="/wiki/by%C5%82abym#Polish" title="byłabym">byłabym</a></span></td>
            <td> 
            </td>
            <td><span class="Latn" lang="pl"><a href="/wiki/byliby%C5%9Bmy#Polish"
                        title="bylibyśmy">bylibyśmy</a></span></td>
            <td><span class="Latn" lang="pl"><a href="/wiki/by%C5%82yby%C5%9Bmy#Polish"
                        title="byłybyśmy">byłybyśmy</a></span></td>
        </tr>
        <tr>
            <th><a href="/wiki/second_person" title="second person">2nd</a></th>
            <td><span class="Latn" lang="pl"><a href="/wiki/by%C5%82by%C5%9B#Polish" title="byłbyś">byłbyś</a></span>
            </td>
            <td><span class="Latn" lang="pl"><a href="/wiki/by%C5%82aby%C5%9B#Polish" title="byłabyś">byłabyś</a></span>
            </td>
            <td> 
            </td>
            <td><span class="Latn" lang="pl"><a href="/wiki/byliby%C5%9Bcie#Polish"
                        title="bylibyście">bylibyście</a></span></td>
            <td><span class="Latn" lang="pl"><a href="/wiki/by%C5%82yby%C5%9Bcie#Polish"
                        title="byłybyście">byłybyście</a></span></td>
        </tr>
        <tr>
            <th><a href="/wiki/third_person" title="third person">3rd</a></th>
            <td><span class="Latn" lang="pl"><a href="/wiki/by%C5%82by#Polish" title="byłby">byłby</a></span></td>
            <td><span class="Latn" lang="pl"><a href="/wiki/by%C5%82aby#Polish" title="byłaby">byłaby</a></span></td>
            <td><span class="Latn" lang="pl"><a href="/wiki/by%C5%82oby#Polish" title="byłoby">byłoby</a></span></td>
            <td><span class="Latn" lang="pl"><a href="/wiki/byliby#Polish" title="byliby">byliby</a></span></td>
            <td><span class="Latn" lang="pl"><a href="/wiki/by%C5%82yby#Polish" title="byłyby">byłyby</a></span></td>
        </tr>
        <tr>
            <th rowspan="3"><a href="/wiki/imperative" title="imperative">imperative</a></th>
            <th><a href="/wiki/first_person" title="first person">1st</a></th>
            <td colspan="3"><span class="Latn" lang="pl">niech <a href="/wiki/b%C4%99d%C4%99#Polish"
                        title="będę">będę</a></span></td>
            <td colspan="2"><span class="Latn" lang="pl"><a href="/wiki/b%C4%85d%C5%BAmy#Polish"
                        title="bądźmy">bądźmy</a></span></td>
        </tr>
        <tr>
            <th><a href="/wiki/second_person" title="second person">2nd</a></th>
            <td colspan="3"><span class="Latn" lang="pl"><a href="/wiki/b%C4%85d%C5%BA#Polish"
                        title="bądź">bądź</a></span></td>
            <td colspan="2"><span class="Latn" lang="pl"><a href="/wiki/b%C4%85d%C5%BAcie#Polish"
                        title="bądźcie">bądźcie</a></span></td>
        </tr>
        <tr>
            <th><a href="/wiki/third_person" title="third person">3rd</a></th>
            <td colspan="3"><span class="Latn" lang="pl">niech <a href="/wiki/b%C4%99dzie#Polish"
                        title="będzie">będzie</a></span></td>
            <td colspan="2"><span class="Latn" lang="pl">niech <a href="/wiki/b%C4%99d%C4%85#Polish"
                        title="będą">będą</a></span></td>
        </tr>
        <tr>
            <th colspan="2"><a href="/wiki/active" title="active">active</a> <a href="/wiki/adjectival"
                    title="adjectival">adjectival</a> <a href="/wiki/participle" title="participle">participle</a></th>
            <td><span class="Latn" lang="pl"><a href="/wiki/b%C4%99d%C4%85cy#Polish" title="będący">będący</a></span>
            </td>
            <td><span class="Latn" lang="pl"><a href="/wiki/b%C4%99d%C4%85ca#Polish" title="będąca">będąca</a></span>
            </td>
            <td><span class="Latn" lang="pl"><a href="/wiki/b%C4%99d%C4%85ce#Polish" title="będące">będące</a></span>
            </td>
            <td><span class="Latn" lang="pl"><a href="/wiki/b%C4%99d%C4%85cy#Polish" title="będący">będący</a></span>
            </td>
            <td><span class="Latn" lang="pl"><a href="/wiki/b%C4%99d%C4%85ce#Polish" title="będące">będące</a></span>
            </td>
        </tr>
        <tr>
            <th colspan="2"><a href="/wiki/contemporary" title="contemporary">contemporary</a> <a href="/wiki/adverbial"
                    title="adverbial">adverbial</a> <a href="/wiki/participle" title="participle">participle</a></th>
            <td colspan="5"><span class="Latn" lang="pl"><a href="/wiki/b%C4%99d%C4%85c#Polish"
                        title="będąc">będąc</a></span></td>
        </tr>
        <tr>
            <th colspan="2"><a href="/wiki/anterior" title="anterior">anterior</a> <a href="/wiki/adverbial"
                    title="adverbial">adverbial</a> <a href="/wiki/participle" title="participle">participle</a></th>
            <td colspan="5"><span class="Latn" lang="pl"><a href="/wiki/bywszy#Polish" title="bywszy">bywszy</a></span>
            </td>
        </tr>
        <tr>
            <th colspan="2"><a href="/wiki/verbal_noun" title="verbal noun">verbal noun</a></th>
            <td colspan="5"><span class="Latn" lang="pl"><a href="/wiki/bycie#Polish" title="bycie">bycie</a></span>
            </td>
        </tr>
    </tbody>
</table>
lightswitch05 commented 4 years ago

Looks fine to me:

[{"":"m","singular":"f","plural":"n"},{"":"infinitive","singular":"być","plural":"być"},{"":"1st","singular":"jestem,\n                -m","plural":"jestem,\n                -m"},{"":"2nd","singular":"jesteś, -ś","plural":"jesteś, -ś"},{"":"3rd","singular":"jest","plural":"jest"},{"":"1st","singular":"byłem","plural":"byłam"},{"":"2nd","singular":"byłeś","plural":"byłaś"},{"":"3rd","singular":"był","plural":"była"},{"":"1st","singular":"będę","plural":"będę"},{"":"2nd","singular":"będziesz","plural":"będziesz"},{"":"3rd","singular":"będzie","plural":"będzie"},{"":"1st","singular":"byłbym","plural":"byłabym"},{"":"2nd","singular":"byłbyś","plural":"byłabyś"},{"":"3rd","singular":"byłby","plural":"byłaby"},{"":"1st","singular":"niech będę","plural":"niech będę"},{"":"2nd","singular":"bądź","plural":"bądź"},{"":"3rd","singular":"niech będzie","plural":"niech będzie"},{"":"active adjectival participle","singular":"będący","plural":"będąca"},{"":"contemporary adverbial participle","singular":"będąc","plural":"będąc"},{"":"anterior adverbial participle","singular":"bywszy","plural":"bywszy"},{"":"verbal noun","singular":"bycie","plural":"bycie"}]

The template does this:

     var table = $('#students').tableToJSON();
     alert(JSON.stringify(table));  

which means it only converts the table whose ID is students. Put that ID on your table element and try again.
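
As a rough sketch of the point above: either give the pasted table id="students", or aim the selector at the table that is actually on the page. The helper below is hypothetical (convertTable is not part of table-to-json). It takes the jQuery function as a parameter and warns when the selector matches nothing, which is the situation that produced [] in the template.

```javascript
// Hypothetical helper, not part of table-to-json.
// `$` is assumed to be the jQuery function with the table-to-json plugin loaded.
function convertTable($, selector) {
  var $table = $(selector);
  if ($table.length === 0) {
    // Nothing matched, e.g. '#students' when the table has no such id.
    console.warn('No table matches selector: ' + selector);
    return [];
  }
  return $table.tableToJSON();
}

// Usage in the browser (assumes the Wiktionary table keeps its "wikitable" class):
// alert(JSON.stringify(convertTable(jQuery, 'table.wikitable')));
```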

onsa commented 4 years ago

Thanks.

It does parse now, but only into this:

[
  {
    "": "m",
    "singular": "f",
    "plural": "n"
  },
  {
    "": "infinitive",
    "singular": "być",
    "plural": "być"
  },
  {
    "": "1st",
    "singular": "jestem,\n                -m",
    "plural": "jestem,\n                -m"
  },
  {
    "": "2nd",
    "singular": "jesteś, -ś",
    "plural": "jesteś, -ś"
  },
  {
    "": "3rd",
    "singular": "jest",
    "plural": "jest"
  },
  {
    "": "1st",
    "singular": "byłem",
    "plural": "byłam"
  },
  {
    "": "2nd",
    "singular": "byłeś",
    "plural": "byłaś"
  },
  {
    "": "3rd",
    "singular": "był",
    "plural": "była"
  },
  {
    "": "1st",
    "singular": "będę",
    "plural": "będę"
  },
  {
    "": "2nd",
    "singular": "będziesz",
    "plural": "będziesz"
  },
  {
    "": "3rd",
    "singular": "będzie",
    "plural": "będzie"
  },
  {
    "": "1st",
    "singular": "byłbym",
    "plural": "byłabym"
  },
  {
    "": "2nd",
    "singular": "byłbyś",
    "plural": "byłabyś"
  },
  {
    "": "3rd",
    "singular": "byłby",
    "plural": "byłaby"
  },
  {
    "": "1st",
    "singular": "niech będę",
    "plural": "niech będę"
  },
  {
    "": "2nd",
    "singular": "bądź",
    "plural": "bądź"
  },
  {
    "": "3rd",
    "singular": "niech będzie",
    "plural": "niech będzie"
  },
  {
    "": "active adjectival participle",
    "singular": "będący",
    "plural": "będąca"
  },
  {
    "": "contemporary adverbial participle",
    "singular": "będąc",
    "plural": "będąc"
  },
  {
    "": "anterior adverbial participle",
    "singular": "bywszy",
    "plural": "bywszy"
  },
  {
    "": "verbal noun",
    "singular": "bycie",
    "plural": "bycie"
  }
]

So there are lots of empty keys (where <th> tags are parsed), and some of the <th>s are lost altogether (e.g. present indicative and past indicative). Am I still missing something?

Cheers!
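
The output above is consistent with a simple model of header-keyed conversion: each cell is stored under the heading text of its column, so two columns whose heading is the empty string share the key "" and the later one wins. The sketch below is only that simplified model, not the plugin's actual code; rowToObject is a hypothetical name, and the sample row is written with its colspan cells already repeated.

```javascript
// Simplified model (an assumption, not table-to-json's implementation):
// store each cell under the heading text of its column.
function rowToObject(headings, cells) {
  var result = {};
  headings.forEach(function (heading, index) {
    // Duplicate headings collide: a later column overwrites an earlier one.
    result[heading] = cells[index];
  });
  return result;
}

// First header row of the Wiktionary table: two blank cells, then two labels.
var headings = ['', '', 'singular', 'plural'];
// "present indicative" row, with the colspan cells repeated.
var cells = ['present indicative', '1st', 'jestem, -m', 'jestem, -m',
             'jestem, -m', 'jesteśmy, -śmy', 'jesteśmy, -śmy'];

console.log(JSON.stringify(rowToObject(headings, cells)));
// prints {"":"1st","singular":"jestem, -m","plural":"jestem, -m"}
```

In this model "present indicative" is overwritten by "1st", and every cell past the fourth column is dropped because no heading exists for it.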

lightswitch05 commented 4 years ago

I just looked at the HTML more closely, @onsa. I don't think this tool can work with this type of table: too many <th> cells are spread across different rows.
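
To illustrate what handling such a table would involve, every rowspan and colspan would first have to be resolved into a full rectangular grid, before any decision about how to combine row and column headers into keys. The sketch below shows only that first step, under stated assumptions: expandSpans is a hypothetical helper, not something table-to-json provides, and the cells are assumed to have been read into plain { text, rowspan, colspan } objects already.

```javascript
// Hypothetical helper, not part of table-to-json.
// Each row is an array of cells: { text, rowspan, colspan } (spans default to 1).
function expandSpans(rows) {
  var grid = [];
  rows.forEach(function (row, r) {
    grid[r] = grid[r] || [];
    var c = 0;
    row.forEach(function (cell) {
      // Skip slots already filled by a rowspan from an earlier row.
      while (grid[r][c] !== undefined) { c++; }
      var rowspan = cell.rowspan || 1;
      var colspan = cell.colspan || 1;
      // Copy the cell text into every slot the cell covers.
      for (var dr = 0; dr < rowspan; dr++) {
        grid[r + dr] = grid[r + dr] || [];
        for (var dc = 0; dc < colspan; dc++) {
          grid[r + dr][c + dc] = cell.text;
        }
      }
      c += colspan;
    });
  });
  return grid;
}

// The three "present indicative" rows from the table in this issue.
var rows = [
  [{ text: 'present indicative', rowspan: 3 }, { text: '1st' },
   { text: 'jestem', colspan: 3 }, { text: 'jesteśmy', colspan: 2 }],
  [{ text: '2nd' }, { text: 'jesteś', colspan: 3 }, { text: 'jesteście', colspan: 2 }],
  [{ text: '3rd' }, { text: 'jest', colspan: 3 }, { text: 'są', colspan: 2 }]
];
var grid = expandSpans(rows);
console.log(grid[1].join(' | '));
// prints present indicative | 2nd | jesteś | jesteś | jesteś | jesteście | jesteście
```

With the grid in place, the row label "present indicative" is available on all three rows, but choosing key names from the two header rows remains a separate design decision.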