LeMoussel closed 7 years ago
OK, I will check it later, but not before next week.
PR with tests: https://github.com/bopoda/robots-txt-parser/pull/21
From the Google spec:
URL | allow: | disallow: | Verdict | Comments |
---|---|---|---|---|
http://example.com/page | /p | / | allow | RobotsTxtValidator - ok |
http://example.com/folder/page | /folder/ | /folder | allow | RobotsTxtValidator - ok |
http://example.com/page.htm | /page | /*.htm | undefined | RobotsTxtValidator currently detects this as disallow. If it is undefined on Google's side, should we fix it? |
http://example.com/ | /$ | / | allow | RobotsTxtValidator - ok |
http://example.com/page.htm | /$ | / | disallow | RobotsTxtValidator - ok |
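For reference, the rule behind the table can be sketched in a few lines. This is only an illustration in Python, not the RobotsTxtValidator code: `pattern_to_regex` and `verdict` are hypothetical names, and letting allow win a same-length tie is an assumption. The spec says the longest (most specific) matching path wins, and that precedence between rules with wildcards is undefined.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal parts and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def verdict(path, allow, disallow):
    """Return 'allow', 'disallow' or 'undefined' for one allow/disallow pair."""
    allow_hit = pattern_to_regex(allow).match(path) is not None
    disallow_hit = pattern_to_regex(disallow).match(path) is not None
    if allow_hit and disallow_hit:
        # The spec leaves precedence between wildcard rules undefined.
        if "*" in allow or "*" in disallow:
            return "undefined"
        # Longest pattern wins; allow winning a tie is an assumption here.
        return "allow" if len(allow) >= len(disallow) else "disallow"
    if disallow_hit:
        return "disallow"
    # Only allow matched, or nothing matched: crawling is permitted.
    return "allow"


rows = [
    ("/page", "/p", "/"),
    ("/folder/page", "/folder/", "/folder"),
    ("/page.htm", "/page", "/*.htm"),
    ("/", "/$", "/"),
    ("/page.htm", "/$", "/"),
]
for path, allow, disallow in rows:
    print(path, allow, disallow, verdict(path, allow, disallow))
```

The five rows are the ones from the table, so the printed verdicts can be compared against the "Verdict" column directly.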
Good job, thanks a lot 👍 For Google, "undefined" may turn out to be either allow or disallow. It depends on how the robots.txt rules are sorted (see Issue #13).
I will add more tests from the Google Robots.txt Specifications, maybe all of them. There are a lot of cases, and they will show how correct the library is.
Perhaps Google's own robots.txt file can help build tests.
https://github.com/bopoda/robots-txt-parser/pull/22 As far as I can see, all cases from the table at https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt?hl=en#example-path-matches are handled correctly, except the case-sensitive ones. Travis build: https://travis-ci.org/bopoda/robots-txt-parser/jobs/191336766.
In the Google spec:
The \<field> element is case-insensitive. The \<value> element may be case-sensitive, depending on the \<field> element.
@LeMoussel should we make values case-sensitive? Right now the parser converts all of them to lower case.
Yes, make values case-sensitive (note: URLs are case-sensitive). Converting them all to lower case is a bug.
https://github.com/bopoda/robots-txt-parser/pull/23 RobotsTxtParser: make values (URLs) case-sensitive.
https://github.com/bopoda/robots-txt-parser/pull/22 Tests from the Google spec, plus a fix so that RobotsTxtValidator checks allow/disallow URLs case-sensitively.
All tests from the Google specifications now pass in master.
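To make the case-sensitivity point concrete, here is a rough sketch of the idea in Python. It is not the code from the PRs above, and `parse_line` is a hypothetical helper: the field name is lowercased because the spec calls it case-insensitive, while the value keeps its original case because URL paths are case-sensitive.

```python
def parse_line(line):
    """Split one robots.txt line into (field, value), or return None.

    The field is normalised to lower case; the value is left untouched.
    """
    # Drop a trailing comment and surrounding whitespace.
    line = line.split("#", 1)[0].strip()
    if ":" not in line:
        return None
    field, value = line.split(":", 1)
    return field.strip().lower(), value.strip()


print(parse_line("Disallow: /Admin/"))  # field lowercased, path case preserved
print(parse_line("USER-AGENT: Googlebot"))
```

With this split, `/Admin/` and `/admin/` stay distinct rules, which is what the case-sensitive tests from the spec expect.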
From the Google Robots.txt Specifications, "Order of precedence for group-member records":