bopoda / robots-txt-parser

PHP class for parsing all directives from robots.txt files according to specifications
http://robots.jeka.by
MIT License

Fix of not allowed char #56

Closed phenixiim closed 5 years ago

phenixiim commented 5 years ago

Square brackets [] in a rule make preg_match fail, which raises a warning and an unexpected end.

bopoda commented 5 years ago

@phenixiim Thanks for the PR!

I see that the Travis builds are red, but that is caused by changes on the Travis CI side, not by the changes to the codebase. I will sort out CI later.

Your commit makes sense and fixes the regexp problem, which is why it is merged. I think it's still possible to improve the regexp to avoid errors like preg_match(): Compilation failed:. For example, the characters ( ) should also be escaped. So it would be nice to find a common approach for escaping (in another PR, anyway).

Thanks!

phenixiim commented 5 years ago

Thanks for your package!

What about this?

https://www.php.net/manual/en/function.preg-quote.php

Couldn't this be the cure?
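
As a minimal sketch of the idea (not code from this package), preg_quote escapes every regex metacharacter in a rule path before it is placed in a pattern. The helper name buildPattern, the sample rule, and the # delimiter are assumptions made for illustration only.

```php
<?php
// Hypothetical helper, not part of robots-txt-parser: builds an anchored
// pattern from a raw robots.txt path, escaping all regex metacharacters.
function buildPattern(string $rulePath): string
{
    // The second argument makes sure the chosen delimiter (#) is escaped too.
    return '#^' . preg_quote($rulePath, '#') . '#';
}

// Made-up rule containing the characters discussed in this thread.
$rule = '/search[]/(old';

// Unescaped, the pattern fails to compile: preg_match() returns false
// and emits a warning (suppressed here with @).
$raw = @preg_match('#^' . $rule . '#', '/search[]/(old/page');
var_dump($raw);

// Escaped, the brackets and the parenthesis are matched literally.
$quoted = preg_match(buildPattern($rule), '/search[]/(old/page');
var_dump($quoted);
```

Note that plain preg_quote also escapes * and $, so on its own it would make robots.txt wildcards match literally.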

Best regards, Dalibor Jaroš


bopoda commented 4 years ago

Hi @phenixiim, it makes sense to use preg_quote. I created a PR for it: https://github.com/bopoda/robots-txt-parser/pull/58/. To be honest, we should keep the characters * and $ unescaped, because they act as wildcards in robots.txt rules. That's why some magic is still there.

User-Agent: *
Disallow: /fish*.php

This should match URLs such as:

/fish.php
/fisher.php
/fishheads/catfish.php?parameters
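
One way to picture the "magic" is to quote the whole rule with preg_quote, then turn the escaped * back into "any sequence of characters" and a trailing escaped $ back into an end anchor. The following is only an illustrative sketch under those assumptions, not the implementation from PR #58. The function names ruleToRegex and isBlocked are hypothetical.

```php
<?php
// Hypothetical helpers for illustration; not the package's API.
function ruleToRegex(string $rulePath): string
{
    // Escape everything first, including the # delimiter.
    $quoted = preg_quote($rulePath, '#');

    // preg_quote turns * into \*; restore it as "any sequence of characters".
    $quoted = str_replace('\*', '.*', $quoted);

    // A trailing $ (escaped to \$) anchors the rule to the end of the URL.
    if (substr($quoted, -2) === '\$') {
        $quoted = substr($quoted, 0, -2) . '$';
    }

    return '#^' . $quoted . '#';
}

function isBlocked(string $rulePath, string $url): bool
{
    return preg_match(ruleToRegex($rulePath), $url) === 1;
}

// The three URLs from the example above, plus one that differs in case.
$urls = ['/fish.php', '/fisher.php', '/fishheads/catfish.php?parameters', '/Fish.PHP'];
foreach ($urls as $url) {
    echo $url, ' => ', isBlocked('/fish*.php', $url) ? 'match' : 'no match', PHP_EOL;
}
```

Escaping first and restoring only the two wildcard characters afterwards means that any other metacharacter in a rule, such as [ ] or ( ), is still treated literally.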