halfer closed 9 years ago
Good question. I'll look into that.
Sounds fair that the user agent string should be case insensitive in this instance.
Fixed in release 0.2
Many thanks Jon, I look forward to giving this a try.
No problem.
Out of interest, in what projects are you using this package?
I'm working on a crawler that fetches data using a site-specific set of fetch/processing commands edited in a web interface, and which will hopefully let me create structured datasets from a wide range of semi-structured pages. I don't know if it is a viable project yet :-) but I hope to have a prototype running in the next couple of weeks.
Awesome!
I've been scratching my head over this problem for a few hours! I tried this code initially:
It would pick up the * rules but not the BadBot-specific rules. However, this works:
Now the docs say the UA string is case insensitive, but I'm definitely getting different results:
My test robots file:
I can use strtolower() for now, of course, but I imagine this would be better in the library. Have I made a mistake somewhere, or is this really case insensitive here?
PHP 5.5, Ubuntu 13.10. My hacky testing was inside a project, but I can try creating a standalone script, if that's helpful.
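For anyone hitting the same thing, here is a minimal standalone sketch of the strtolower() idea: lower-case the user agent both when the robots file is parsed and when rules are looked up, so BadBot and badbot land on the same group. The function names parseRobotsRules() and rulesForAgent() and the sample robots content are hypothetical, made up for illustration. This is not the library's API and not necessarily how the 0.2 fix was done.

```php
<?php
// Hypothetical helper (not part of the library discussed here).
// Parses robots.txt text into rules keyed by LOWER-CASED user agent.
function parseRobotsRules($robotsTxt)
{
    $rules = array();
    $currentAgents = array();
    $lastWasAgent = false;

    foreach (preg_split('/\r\n|\r|\n/', $robotsTxt) as $line) {
        // Strip comments and surrounding whitespace; skip lines without a field.
        $line = trim(preg_replace('/#.*$/', '', $line));
        if ($line === '' || strpos($line, ':') === false) {
            continue;
        }

        list($field, $value) = array_map('trim', explode(':', $line, 2));
        $field = strtolower($field);

        if ($field === 'user-agent') {
            // Consecutive User-agent lines share one group of rules.
            if (!$lastWasAgent) {
                $currentAgents = array();
            }
            $agent = strtolower($value); // normalise case once, at parse time
            $currentAgents[] = $agent;
            if (!isset($rules[$agent])) {
                $rules[$agent] = array();
            }
            $lastWasAgent = true;
        } elseif ($field === 'allow' || $field === 'disallow') {
            foreach ($currentAgents as $agent) {
                $rules[$agent][] = array($field, $value);
            }
            $lastWasAgent = false;
        }
    }

    return $rules;
}

// Hypothetical lookup: exact (case-insensitive) agent match, else the * group.
function rulesForAgent(array $rules, $userAgent)
{
    $key = strtolower($userAgent); // same normalisation at lookup time
    if (isset($rules[$key])) {
        return $rules[$key];
    }
    return isset($rules['*']) ? $rules['*'] : array();
}

// Made-up robots content, only for this example.
$robots = "User-agent: *\nDisallow: /private\n\nUser-agent: BadBot\nDisallow: /\n";
$rules = parseRobotsRules($robots);
$mixedCase = rulesForAgent($rules, 'BadBot');
$lowerCase = rulesForAgent($rules, 'badbot');
```

With this approach $mixedCase and $lowerCase hold the same BadBot rules, and any agent without its own group falls back to the * rules. It only does exact name matching, so it is a simplification of what a real parser would need.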