0x333333 / wiki-infobox-parser

A Wikipedia Infobox Parser
https://www.npmjs.com/package/wiki-infobox-parser
MIT License

TypeError: substring.split(...).shift(...).join is not a function #1

Closed EvanBoyle closed 8 years ago

EvanBoyle commented 8 years ago

Looks like there is some parsing logic that is failing. The specific case is "Dartmouth College". Culprit appears to be:

  /* Remove horizon list tag */
  if (item_content.indexOf('{{hlist') !== -1) {
    find = item_content.match(/\{\{hlist[^\}\}]*?\}\}/g);
    find && find.forEach(function(substring) {
      item_content = item_content.replace(substring, substring.split('|').shift().join(',').replace('}}', ''));
    });
  }
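
For context, the TypeError arises because `Array.prototype.shift()` returns the removed first element (here a string such as `{{hlist`), not the array, so `.join` is not defined on the result. Below is a minimal sketch of one possible fix. It assumes the intent is to drop the `hlist` template name and join the remaining items with commas. The function name `flattenHlist` is hypothetical and not part of the library.

```javascript
// Hypothetical helper, not part of wiki-infobox-parser.
// Assumes the goal is to turn "{{hlist|a|b|c}}" into "a,b,c".
function flattenHlist(itemContent) {
  if (itemContent.indexOf('{{hlist') === -1) {
    return itemContent;
  }
  var find = itemContent.match(/\{\{hlist[^}]*?\}\}/g);
  if (find) {
    find.forEach(function (substring) {
      // slice(1) returns a new array without the leading "{{hlist" entry,
      // whereas shift() returns the removed element itself (a string).
      var items = substring.replace('}}', '').split('|').slice(1);
      itemContent = itemContent.replace(substring, items.join(','));
    });
  }
  return itemContent;
}
```

With this sketch, `flattenHlist('x {{hlist|Big Green|Keggy}} y')` yields `'x Big Green,Keggy y'`, and input without an `hlist` template is returned unchanged.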

Haven't had time to debug this. While I appreciate the rapid progress you're making, I think you need a more robust E2E test to run after making big changes to things like parsing. Run the parser over 30K or 100K wiki pages after you make a change and see if there are any regressions. Pay attention to failure modes as well. Try to use more try/catch, especially in the parser, so that people who take dependencies on you don't experience application crashes.
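
The regression run suggested here could be sketched roughly as follows. This assumes a hypothetical synchronous `parse(text)` function and an in-memory array of `{ title, text }` pages. Neither reflects the library's actual API, and the toy parser at the bottom only reproduces the failing pattern from this report.

```javascript
// Hypothetical regression runner; `parse` and the page shape are assumptions.
function runRegression(pages, parse) {
  var failures = [];
  pages.forEach(function (page) {
    try {
      parse(page.text);
    } catch (err) {
      // Record the failure instead of letting one bad page abort the run.
      failures.push({ title: page.title, error: String(err) });
    }
  });
  return { total: pages.length, failed: failures.length, failures: failures };
}

// Usage with a toy parser that mimics the shift().join() bug.
var report = runRegression(
  [
    { title: 'Plain page', text: 'no templates' },
    { title: 'Dartmouth College', text: '{{hlist|a|b}}' }
  ],
  function (text) {
    if (text.indexOf('{{hlist') !== -1) {
      return text.split('|').shift().join(','); // throws TypeError
    }
    return text;
  }
);
console.log(report.failed + ' of ' + report.total + ' pages failed');
```

Collecting failures rather than stopping at the first one makes it easier to compare failure counts before and after a parser change.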

0x333333 commented 8 years ago

Thanks a lot @EvanBoyle, it will be fixed soon. By the way, do you have any suggestions for the test cases?

0x333333 commented 8 years ago

Hi @EvanBoyle, I've added ~500 test cases to ensure that this parser won't crash under any circumstances. I will keep adding more test cases.

EvanBoyle commented 8 years ago

Awesome, thanks so much! I can try to put together a JSON file for you with all of the wiki pages I've been working with, about 30k. It won't be until next week though.

On Tuesday, October 20, 2015, Zhipeng JIANG notifications@github.com wrote:

Hi @EvanBoyle https://github.com/EvanBoyle, I've added ~500 test cases to ensure that this parser won't crash under any circumstance. I will keep on adding more test cases.


0x333333 commented 8 years ago

Cool, more tests are always welcome. Thanks! :smile: