Attempting to parse rules that use `\n` as a line separator on a Windows machine fails, because the parser splits on the `PHP_EOL` constant, which is `\r\n` on Windows. The result is a single `user-agent` directive containing the full sitemap content. This is caused by `RobotsTxtParser->prepareRules()` in line 148.
Some form of line separator detection and normalisation should be used instead.
As a workaround, applying the following RegEx before handing the robots.txt content to `RobotsTxtParser` will work:
```php
$content = preg_replace("/\R/u", PHP_EOL, $content);
$parser = new RobotsTxtParser($content);
```
This RegEx replaces all Unicode newlines with the system newline, which the parser currently splits on. The same normalisation could also be applied inside the parser to fix this issue.
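A minimal sketch of what that fix could look like (the helper name here is illustrative, not part of the library's actual API), assuming the parser splits rules on `PHP_EOL`:

```php
<?php

// Hypothetical helper illustrating the proposed fix: normalise every
// Unicode line break (\r\n, \r, \n, NEL, ...) to the platform's PHP_EOL
// before the content is split into lines.
function normalizeLineEndings(string $content): string
{
    // \R matches any Unicode newline sequence; /u enables UTF-8 mode.
    return preg_replace("/\R/u", PHP_EOL, $content);
}

// Example: robots.txt content with Unix line endings.
$content = "User-agent: *\nDisallow: /private\n";

$normalized = normalizeLineEndings($content);

// After normalisation, splitting on PHP_EOL yields one directive per
// line on any platform, instead of one giant directive on Windows.
$lines = explode(PHP_EOL, trim($normalized));
```

Calling this at the top of `prepareRules()` would make the subsequent `PHP_EOL` split safe regardless of which line endings the fetched robots.txt uses.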