digitalmethodsinitiative / dmi-tcat

Digital Methods Initiative - Twitter Capture and Analysis Toolset
Apache License 2.0

DMI-TCAT controller killed a process #2

Closed supersambo closed 10 years ago

supersambo commented 10 years ago

I'm constantly receiving emails saying that dmi-tcat killed the process because the script was idle. This started suddenly, and I'm not aware of any significant changes to my server that could cause this problem. I'm getting the message every 5 minutes.

The log files do not point me to the problem. dmi-tcat/capture/stream/logs/error.log just contains:

2013-10-29 07:05:03 connecting to API socket
2013-10-29 07:05:03 connecting - query array (
  'track' => 'global warming,globalwarming,climate,climatechange,yasuni,yasuniitt',
)
2013-10-29 07:11:03 connecting to API socket
2013-10-29 07:11:03 connecting - query array (
  'track' => 'global warming,globalwarming,climate,climatechange,yasuni,yasuniitt',
)
2013-10-29 07:17:03 connecting to API socket
2013-10-29 07:17:03 connecting - query array (
  'track' => 'global warming,globalwarming,climate,climatechange,yasuni,yasuniitt',
)

dmi-tcat/capture/stream/logs/controller.log states:

2013-10-29 07:05:01 script was idle for more than 300 seconds - killing and starting
2013-10-29 07:06:01 script called - pid:3038 idle:58
2013-10-29 07:07:01 script called - pid:3038 idle:118
2013-10-29 07:08:01 script called - pid:3038 idle:178
2013-10-29 07:09:01 script called - pid:3038 idle:238
2013-10-29 07:10:01 script called - pid:3038 idle:298
2013-10-29 07:11:01 script called - pid:3038 idle:358
2013-10-29 07:11:01 script was idle for more than 300 seconds - killing and starting
2013-10-29 07:12:01 script called - pid:3119 idle:58
2013-10-29 07:13:01 script called - pid:3119 idle:118
2013-10-29 07:14:01 script called - pid:3119 idle:178
2013-10-29 07:15:01 script called - pid:3119 idle:238
2013-10-29 07:16:01 script called - pid:3119 idle:298
2013-10-29 07:17:01 script called - pid:3119 idle:358
2013-10-29 07:17:01 script was idle for more than 300 seconds - killing and starting

The Apache error log does not contain any relevant information either.

I tested my Twitter tokens and they work well with some Python scripts that retrieve user information.
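
The controller.log excerpt quoted in this post shows the idle counter rising by about 60 seconds per controller run until it passes 300, at which point the process is killed and restarted. As a rough aid for reading such logs, here is a minimal Python sketch that counts restarts and records the highest idle value per pid. It assumes one log entry per line in the format quoted here; the function name `summarize` is hypothetical and not part of DMI-TCAT.

```python
import re

# Matches entries such as "2013-10-29 07:06:01 script called - pid:3038 idle:58".
CALLED = re.compile(r"script called - pid:(\d+) idle:(\d+)")
# Marker text of the kill-and-restart entries in the quoted log.
KILLED = "script was idle for more than"


def summarize(log_text):
    """Return (restart_count, {pid: highest idle seconds seen}) for a log excerpt."""
    restarts = 0
    max_idle = {}
    for line in log_text.splitlines():
        if KILLED in line:
            restarts += 1
            continue
        match = CALLED.search(line)
        if match:
            pid, idle = int(match.group(1)), int(match.group(2))
            max_idle[pid] = max(idle, max_idle.get(pid, 0))
    return restarts, max_idle


# Excerpt taken from the controller.log quoted in this post.
sample = """\
2013-10-29 07:05:01 script was idle for more than 300 seconds - killing and starting
2013-10-29 07:06:01 script called - pid:3038 idle:58
2013-10-29 07:10:01 script called - pid:3038 idle:298
2013-10-29 07:11:01 script called - pid:3038 idle:358
2013-10-29 07:11:01 script was idle for more than 300 seconds - killing and starting
"""

restarts, max_idle = summarize(sample)
print(restarts, max_idle)
```

A pid whose highest idle value sits just above the threshold, as here, suggests the process never registered any activity between restarts.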

ErikBorra commented 10 years ago

Hi,

This most likely happens because one of your MySQL tables has crashed and needs to be repaired. Try SHOW TABLE STATUS: it will show NULLs for the tables that really need REPAIR or myisam_recover.

Best,

Erik
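
To illustrate the check suggested here: a minimal Python sketch that filters rows shaped like SHOW TABLE STATUS output and keeps the tables whose Engine column is NULL. This assumes the rows were already fetched as dictionaries (NULL becomes None); the table names and the function name are hypothetical, and no database connection is made. As far as I know, a NULL Check_time on its own only means the table has not been checked yet, so the sketch does not treat it as a sign of a crash.

```python
def tables_needing_repair(status_rows):
    """Return names of tables whose Engine column is NULL (None in Python)."""
    return [row["Name"] for row in status_rows if row.get("Engine") is None]


# Hypothetical rows, shaped like SHOW TABLE STATUS output fetched as dictionaries.
rows = [
    # Healthy table that has never been checked: Engine is set, Check_time is NULL.
    {"Name": "climate_tweets", "Engine": "MyISAM", "Rows": 120000, "Check_time": None},
    # Table the server could not open: most columns, including Engine, are NULL.
    {"Name": "climate_hashtags", "Engine": None, "Rows": None, "Check_time": None},
]

print(tables_needing_repair(rows))
```

Each table returned this way would then be a candidate for REPAIR TABLE.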

supersambo commented 10 years ago

Hi Erik, thank you for replying quickly. All of my tables had status NULL. Apparently repairing them solved the problem!

thanks a lot

best, stephan

supersambo commented 10 years ago

Hi Erik, I had to reopen this because unfortunately the problem remains. After repairing the tables it worked well, but some hours later the problem appeared again. Now the field check_time in the table status (which showed NULL last time) shows a reasonable timestamp, and repairing the tables has no effect.

best stephan

ErikBorra commented 10 years ago

Hi,

pardon the slow response!

Do you still have this issue? If not, how did you work around it?

Best,

Erik

supersambo commented 10 years ago

Hi, no problem. Unfortunately, I haven't found a solution, not even a hint.

best, stephan

ErikBorra commented 10 years ago

Instead of running php controller.php in dmi-tcat/capture/stream/ could you try running php capture.php and send me any possible output?

Are you sure the PHP script can connect to your database? I.e. do you have the right credentials etc?

supersambo commented 10 years ago

When I run php capture.php I get:

PHP Deprecated: mysql_connect(): The mysql extension is deprecated and will be removed in the future: use mysqli or PDO instead in /home/supersambo/www/dmi-tcat/common/functions.php on line 5

But that shouldn't be a problem yet, right? I had already captured a few million tweets before the issue occurred for the first time, so I don't think credentials can be the problem. Maybe my HDD isn't fast enough for this?

ErikBorra commented 10 years ago

That mysql warning should not be the problem. However, I’d be interested to know which PHP and MySQL versions you are running.

Can you access the web interface to the analysis modules?

Are the queries you are running super-high-volume? With our SSD setup we are currently capturing about one million tweets a day. It seems a HD should thus still be able to do a significant fraction of that. Next week I hope to set up a machine with a HD so that I can test the throughput of that.

You could try increasing the $idletime variable (in seconds) in dmi-tcat/capture/stream/controller.php but I’m afraid that won’t solve the issue.

Please let me know if you think of anything else, I’ll do the same :)

ErikBorra commented 10 years ago

We realized that your queries might result in fewer than 100 tweets per 5 minutes. We have adjusted dmi-tcat/capture/stream/capture.php to fix this issue. See https://github.com/digitalmethodsinitiative/dmi-tcat/commit/206c88a035ab763e44de6ca345d37c943298cf72

supersambo commented 10 years ago

Hi Erik, sorry for answering so late again. I pulled recently, but this had no effect. I also changed $idletime, but as you guessed, it didn't solve the problem.

Currently I'm using super-low-volume queries (a few hundred tweets a week).

I'm using PHP version 5.5.3-1.

MySQL:

mysql> SHOW VARIABLES LIKE "%version%";
+-------------------------+------------------+
| Variable_name           | Value            |
+-------------------------+------------------+
| innodb_version          | 5.5.31           |
| protocol_version        | 10               |
| slave_type_conversions  |                  |
| version                 | 5.5.31-1         |
| version_comment         | (Debian)         |
| version_compile_machine | x86_64           |
| version_compile_os      | debian-linux-gnu |
+-------------------------+------------------+

The interesting thing, I recently realized, is that it's actually capturing tweets although I'm constantly receiving 'DMI-TCAT controller killed a process'-mails.

ErikBorra commented 10 years ago

Hi,

If your queries are super-low-volume, then that’s the reason why you get these mails. The TCAT script expects to get at least 1 insert every five minutes. Try increasing $idletime in dmi-tcat/capture/stream/controller.php from 300 seconds (5 minutes) to 43200 seconds (half a day).

Best,

Erik
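
To make the arithmetic behind this suggestion concrete, here is a minimal Python sketch. It is not TCAT's actual code: the function names are hypothetical, the rule "restart when idle for more than idletime seconds" is inferred from the controller.log messages in this thread, and 300 tweets per week is just an illustrative reading of "a few hundred tweets a week".

```python
def should_restart(idle_seconds, idletime=300):
    """True when the capture script has been idle for more than idletime seconds."""
    return idle_seconds > idletime


def average_gap_seconds(tweets_per_week):
    """Average number of seconds between tweets at a given weekly volume."""
    seconds_per_week = 7 * 24 * 3600  # 604800
    return seconds_per_week / tweets_per_week


# Illustrative volume (assumption): 300 tweets per week.
gap = average_gap_seconds(300)

print(gap)                                  # average gap in seconds
print(should_restart(gap))                  # with the default 300-second threshold
print(should_restart(gap, idletime=43200))  # with the half-day threshold
```

At that volume the average gap between tweets is 2016 seconds, well above 300, so a 300-second threshold triggers restarts even though capture is working, while a 43200-second threshold does not.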

supersambo commented 10 years ago

:) This is really funny. Can't believe this was the problem! I just set up "#xaver" and dmi-tcat is not sending any error messages. To reconstruct the issue: first I had some damaged MySQL tables, and I tried to solve that problem by lowering the volume, which caused another problem (which turns out not to be a real problem).

Thank you very much for your patience, Erik! And sorry for bothering you for so long! I'm happy to close this issue now!

best, Stephan