postlight / parser

📜 Extract meaningful content from the chaos of a web page
https://reader.postlight.com
Apache License 2.0

hyphen (dash) in domain name fails tests #566

Open Buratinator opened 4 years ago

Buratinator commented 4 years ago

I'm trying to create a custom parser for www.news-medical.net. The custom parser is generated by running yarn generate-parser and passing the URL https://www.news-medical.net/news/20200524/Heparin-may-stop-SARS-CoV-2-infecting-host-cells.aspx. Running yarn watch:test -- www.news-medical.net then results in an error (see Current Behavior below for the output).

Expected Behavior

The test suite fails to run, and the error points at the hyphen in const WwwNews-medicalNetExtractor =, a name that is derived from the domain name. A hyphen is not allowed in a JavaScript identifier, so I guess the generated const name should not contain hyphens even if the domain name does.
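To illustrate why the generated name cannot be parsed, here is a small sketch that checks whether a candidate name is accepted in a const declaration. The helper name isValidConstName is hypothetical and is not part of this repository; it relies only on the standard Function constructor, which throws a SyntaxError when its body does not parse.

```javascript
// Hypothetical helper (not from the parser repo): returns true when `name`
// can be used as the identifier in a `const` declaration.
function isValidConstName(name) {
  try {
    // The Function constructor parses the body immediately and throws a
    // SyntaxError if it is not valid JavaScript. The body is never executed.
    new Function(`const ${name} = {};`);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) {
      return false;
    }
    throw err; // anything else is unexpected, so re-throw it
  }
}

console.log(isValidConstName('WwwNews-medicalNetExtractor'));
console.log(isValidConstName('WwwNewsmedicalNetExtractor'));
```

The first call reports the hyphenated name as invalid, and the second reports the hand-edited name as valid.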

Current Behavior

PS > yarn watch:test -- www.news-medical.net
yarn run v1.22.4
warning From Yarn 1.0 onwards, scripts don't require "--" for options to be forwarded. In a future version, any explicit "--" will be forwarded as-is to the scripts.
$ jest --watch www.news-medical.net
Browserslist: caniuse-lite is outdated. Please run next command `yarn upgrade caniuse-lite browserslist`
 FAIL  src/extractors/custom/www.news-medical.net/index.test.js
  ● Test suite failed to run

    Jest encountered an unexpected token

    This usually means that you are trying to import a file which Jest cannot parse, e.g. it's not plain JavaScript.

    By default, if Jest sees a Babel config, it will use that to transform your files, ignoring "node_modules".

    Here's what you can do:
     • To have some of your "node_modules" files transformed, you can specify a custom "transformIgnorePatterns" in your config.
     • If you need a custom transformation specify a "transform" option in your config.
     • If you simply want to mock your non-JS modules (e.g. binary assets) you can stub them out with the "moduleNameMapper" config option.

    You'll find more details and examples of these config options in the docs:
    https://jestjs.io/docs/en/configuration.html

    Details:

    SyntaxError: D:\OneDrive - AskAnalyst\WORK\Avogadro One\Git\MercuryParser\src\extractors\custom\www.news-medical.net\index.js: Unexpected token (1:20)

    > 1 | export const WwwNews-medicalNetExtractor = {
        |                     ^
      2 |   domain: 'www.news-medical.net',
      3 |
      4 |   title: {

      at Parser.raise (node_modules/@babel/parser/lib/index.js:4051:15)
      at Parser.unexpected (node_modules/@babel/parser/lib/index.js:5382:16)
      at Parser.parseVar (node_modules/@babel/parser/lib/index.js:8154:18)
      at Parser.parseVarStatement (node_modules/@babel/parser/lib/index.js:7964:10)
      at Parser.parseStatementContent (node_modules/@babel/parser/lib/index.js:7555:21)
      at Parser.parseStatement (node_modules/@babel/parser/lib/index.js:7505:17)
      at Parser.parseExportDeclaration (node_modules/@babel/parser/lib/index.js:8638:17)
      at Parser.parseExport (node_modules/@babel/parser/lib/index.js:8585:31)
      at Parser.parseStatementContent (node_modules/@babel/parser/lib/index.js:7592:27)
      at Parser.parseStatement (node_modules/@babel/parser/lib/index.js:7505:17)

Test Suites: 1 failed, 1 total
Tests:       0 total
Snapshots:   0 total
Time:        9.342s
Ran all test suites matching /www.news-medical.net/i.

Active Filters: filename /www.news-medical.net/
 › Press c to clear filters.

Watch Usage
 › Press a to run all tests.
 › Press f to run only failed tests.
 › Press o to only run tests related to changed files.
 › Press p to filter by a filename regex pattern.
 › Press t to filter by a test name regex pattern.
 › Press q to quit watch mode.
 › Press Enter to trigger a test run.

Steps to Reproduce

  1. Create a custom parser for https://www.news-medical.net/news/20200524/Heparin-may-stop-SARS-CoV-2-infecting-host-cells.aspx
  2. Run yarn watch:test -- www.news-medical.net

Detailed Description

If I edit the index.js for the custom parser, replacing the automatically generated export const WwwNews-medicalNetExtractor = { with export const WwwNewsmedicalNetExtractor = {, the test suite runs successfully.

Possible Solution

My guess is that the parser generator should exclude hyphens when it builds the const name from the domain name.
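A minimal sketch of how such a name could be built, assuming the generator derives it from the hostname by capitalizing each dot-separated part (which is what the generated WwwNews-medicalNetExtractor suggests). The function hostnameToExtractorName is a hypothetical name, not the repository's actual generator code. Capitalizing the part after the hyphen is also an assumption: the manual fix described above simply dropped the hyphen.

```javascript
// Hypothetical helper (not the repo's generator): turns a hostname into a
// name that is safe to use as a JavaScript identifier.
function hostnameToExtractorName(hostname) {
  const name = hostname
    // Split on every run of characters that is not a letter or digit,
    // so both "." and "-" act as separators.
    .split(/[^A-Za-z0-9]+/)
    // Drop empty parts caused by leading or trailing separators.
    .filter(Boolean)
    // Capitalize the first character of each part.
    .map(part => part.charAt(0).toUpperCase() + part.slice(1))
    .join('');

  // An identifier cannot start with a digit, so prefix an underscore then.
  const safeName = /^[0-9]/.test(name) ? `_${name}` : name;

  return `${safeName}Extractor`;
}

console.log(hostnameToExtractorName('www.news-medical.net'));
// prints "WwwNewsMedicalNetExtractor"
```

Only the const name changes in this sketch. The domain field inside the extractor would still hold the original hostname, hyphen included.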