meeting-room-booking-system / mrbs-code

MRBS application code

Serious Googlebot problem #339

Closed jberanek closed 1 year ago

jberanek commented 18 years ago

My log files for MRBS systems sites have exploded to gigabytes in size in one day. They are filled with errors like this:

[Mon Sep 11 09:12:26 2006] [error] [client 66.249.65.163] FastCGI: server "/www/fcgi-bin/neuromedia/php-fcgi" stderr: PHP Warning: checkdate() expects parameter 1 to be long, string given in /www/neuromedia/mnibooking/week.php on line 25

and

[Mon Sep 11 09:27:04 2006] [error] [client 66.249.65.163] FastCGI: server "/www/fcgi-bin/mni-bic/php-fcgi" stderr: PHP Warning: checkdate() expects parameter 1 to be long, string given in /www/mni-bic/bicscheduler/day.php on line 21

There are similar errors in different files, but you get the idea. A little further investigation shows that this IP address belongs to Googlebot, so Google crawling your site is really what is causing the strain.

My university web systems administrator suggested I notify you of this problem. He suggested the following fixes: A quick solution would be to disallow crawlers using robots.txt. The better solution would be to fix those calls to checkdate(). Some strategically placed calls to intval() should do the trick.
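As a rough illustration of the quick fix, a robots.txt along these lines asks well-behaved crawlers to stay out of the booking pages. This is only a sketch: the `/mrbs/` path is a hypothetical install location, not one taken from this report, and crawlers only look for the file at the root of the host.

```
# robots.txt, served from the web server's document root
# "/mrbs/" is a hypothetical install path; use the real one
User-agent: *
Disallow: /mrbs/
```

This only reduces crawler traffic; it does not remove the underlying checkdate() warning.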

Is this something that has been addressed? Is there a fix for this?

Marcus, McGill University

Reported by: mnisystems

Original Ticket: "mrbs/bugs/130":https://sourceforge.net/p/mrbs/bugs/130

jberanek commented 18 years ago

We had exactly the same problem last week (error.log filled with gigabytes of errors like yours).

Can you explain your solution with robots.txt?

Original comment by: zenou

jberanek commented 18 years ago

We should really have a robots.txt; I can't see a reason why a bot should be allowed to trawl through an MRBS installation.

Of course, robots.txt will only work if MRBS is installed at the root of the web server, so I'll add a robots meta tag into the page headers too.
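For readers wondering what a robots instruction in the page headers might look like, here is a minimal sketch. The function name `robots_meta_tag()` and the `noindex, nofollow` value are assumptions for illustration; the thread does not show the exact tag that went into MRBS.

```php
<?php
// Hypothetical helper, not MRBS code: build a robots meta tag for the <head>
// of every generated page, so it applies even when the site-root robots.txt
// is out of the installer's control.
function robots_meta_tag(): string
{
    // "noindex" asks crawlers not to index the page;
    // "nofollow" asks them not to follow the links on it.
    return '<meta name="robots" content="noindex, nofollow">';
}

// Emit the tag inside the page header.
echo "<head>\n", robots_meta_tag(), "\n</head>\n";
```

Unlike robots.txt, a tag like this travels with each page, so it does not depend on where MRBS is installed.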

As for the actual warning causing the logs to grow, I have wrapped the variables with intval(), even though I don't see how those form variables would ever be non-integer.
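The intval() wrapping could be sketched roughly as follows. The helper name `safe_checkdate()` is hypothetical and is not the code that went into MRBS; it only shows the idea of coercing request values to integers before they reach checkdate().

```php
<?php
// Hypothetical helper, not MRBS code: coerce raw request values to integers
// before calling checkdate(), which expects integer arguments.
function safe_checkdate($month, $day, $year): bool
{
    return checkdate(intval($month), intval($day), intval($year));
}

// Query-string values arrive as strings, and a request may carry
// empty or malformed values for them.
var_dump(safe_checkdate('9', '11', '2006'));  // bool(true)
var_dump(safe_checkdate('', 'abc', '2006'));  // bool(false)
```

Because intval() turns an empty or non-numeric string into 0, a malformed request simply fails the date check instead of triggering a type warning.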

John.

Original comment by: jberanek

jberanek commented 17 years ago

Fix released in MRBS 1.2.4.

Original comment by: jberanek

jberanek commented 17 years ago

Do you really think this bug is fixed???

I still get these warnings, and I think whether to stop bots crawling an MRBS installation should be the decision of each user and should not be forced. I'll try to find a solution in my "Facharbeit" (school research paper).

Please reopen this bug.

Original comment by: xhochy

jberanek commented 17 years ago

MRBS is not designed to be crawled by bots, and I can imagine a bot could follow links in MRBS for a very long time, many of those pages being inappropriate things to trawl.

It is therefore my opinion that the only safe way to proceed is to make the robots instructions mandatory.

As for warnings, MRBS runs warning free for me in standard use, and I'm happy enough with that.

John.

Original comment by: jberanek