tykeal / TykeBot

Ruby based XMPP (Jabber) singlet & MUC chat bot

Can't connect to openfire? #12

Closed rchady closed 11 years ago

rchady commented 11 years ago

I'm trying to connect tykebot to an openfire server. I've tried versions 3.6.4 and 3.7.1, and in both cases tykebot just hangs when connecting. Both servers require a secure connection. Is there something that needs to be done to make this work with tykebot?

hartzler commented 11 years ago

try setting jabber_debug : true and debug : true in the config... does that shed any light?
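
For readers following along, here is a minimal sketch of what those two settings might look like once loaded. It assumes the bot's config file is YAML, which the thread suggests but does not confirm. Only the key names `jabber_debug` and `debug` come from the comment above; the helper name is hypothetical.

```ruby
require 'yaml'

# Assumption: the config is YAML. Only the two key names come from the thread.
CONFIG_TEXT = <<~YAML
  jabber_debug : true
  debug : true
YAML

# Hypothetical helper: parse the config text and pull out the two debug flags.
def debug_flags(text)
  config = YAML.safe_load(text)
  { jabber_debug: config['jabber_debug'] == true, debug: config['debug'] == true }
end

flags = debug_flags(CONFIG_TEXT)
puts "jabber_debug=#{flags[:jabber_debug]} debug=#{flags[:debug]}"
```

How the bot acts on these flags is not shown in the thread; the sketch only checks that the keys parse as booleans.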

rchady commented 11 years ago

Well, that tells me what I was already pretty sure of: it is stuck doing TLS.

D, [2013-01-17T12:23:05.870864 #16931] DEBUG -- : RECEIVED:

D, [2013-01-17T12:23:05.871868 #16931] DEBUG -- : TLSv1: OpenSSL handshake in progress

It then never proceeds past that point...

tykeal commented 11 years ago

Did you set the ca_file option in the config pointing to a copy of the CA that signed your server's certificate?

rchady commented 11 years ago

No, I did not see any documentation on that -- can you point me at it?

tykeal commented 11 years ago

Hrmm... looks like we need to update the sample config file. It's woefully out of date.

--[cut]--
ca_file : '/etc/pki/tls/certs/ca-bundle.crt'
--[/cut]--

That assumes that your server is signed by one of the big CAs out there and that you're on a RedHat based system ;)

If you aren't, then get a copy of the CA certificate and just set the option in your config to point to the file.

rchady commented 11 years ago

Great, now to go find that. We are using a self signed certificate...

rchady commented 11 years ago

Duh, self signed certs don't have a cacert. Is there a way to make this accept the self signed cert and continue?

tykeal commented 11 years ago

Set the ca_file anyway. My server is running with the self signed certs as well. We don't run into that problem, then again, it might be because we're running the bot on the same system as the server :-/
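
As background on why a CA file can still make sense with a self-signed certificate: such a certificate is its own issuer, so the certificate file itself can act as the trust anchor. The sketch below uses only Ruby's standard OpenSSL bindings to illustrate that idea. It does not show how TykeBot or its plugins read `ca_file`; the helper names and the hostname are hypothetical.

```ruby
require 'openssl'
require 'tempfile'

# Build a throwaway self-signed certificate for the demo.
# The common name is a placeholder, not a server from the thread.
def make_self_signed(common_name)
  key = OpenSSL::PKey::RSA.new(2048)
  name = OpenSSL::X509::Name.parse("/CN=#{common_name}")
  cert = OpenSSL::X509::Certificate.new
  cert.version = 2                  # X.509 v3
  cert.serial = 1
  cert.subject = name
  cert.issuer = name                # self-signed: issuer equals subject
  cert.public_key = key.public_key
  cert.not_before = Time.now - 60
  cert.not_after = Time.now + 3600
  cert.sign(key, OpenSSL::Digest.new('SHA256'))
  cert
end

# Verify `cert` against the certificates found in `ca_file`.
# With no ca_file the trust store is empty.
def trusted?(cert, ca_file = nil)
  store = OpenSSL::X509::Store.new
  store.add_file(ca_file) if ca_file
  store.verify(cert)
end

cert = make_self_signed('xmpp.example.test')

Tempfile.create(['selfsigned', '.pem']) do |f|
  f.write(cert.to_pem)
  f.flush
  puts "empty trust store:    #{trusted?(cert)}"
  puts "cert used as ca_file: #{trusted?(cert, f.path)}"
end
```

In practice the PEM file would be a copy of the server's public certificate rather than one generated on the spot.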

tykeal commented 11 years ago

Out of curiosity under what Ruby environment are you running the bot?

rchady commented 11 years ago

ruby-1.8.7 currently...

tykeal commented 11 years ago

Well, hartzler did some testing and says that ruby 1.8.7 MRI should work. We run the bot under jruby 1.6.3 (see the .rvm file). Also, after a little digging: I forgot that the ca_file option is for one of the plugins; we don't use it for the bot itself. The bot should accept any certificate it sees. Under jruby, though, you may need to import the certificate into your cert store as a validated certificate, since it's self-signed and java dislikes those in general.

What OS are you doing this on? There may be a global certificate store that you could import the public cert into and mark it as valid.

rchady commented 11 years ago

This is under openSuSE 11.4. I'm painfully aware that java does not like self-signed certs as I've had issues with it in openfire before. I'm going to do some more digging and see if I can get this to work. The bizarre part is it just hangs there indefinitely, never doing a thing. I would expect it to spit an error or say it can't do something rather than just hang. Running a strace on the process shows it is hung doing a select().
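
One way to turn a silent hang like this into an explicit result is to attempt the TLS handshake with a deadline. The following is a rough diagnostic sketch using only Ruby's standard library, not code from TykeBot. `probe_tls` is a hypothetical helper, and it needs a much newer Ruby than 1.8.7 (keyword arguments and `exception: false`). It performs a direct TLS handshake, so it only makes sense against a legacy-SSL port (5223 by XMPP convention); a StartTLS port expects XMPP stream negotiation first.

```ruby
require 'socket'
require 'openssl'

# Hypothetical diagnostic helper: try a direct TLS handshake and give up
# after `timeout` seconds instead of blocking in select() forever.
# Returns :ok, :timeout, or [:error, description].
def probe_tls(host, port, timeout = 5)
  deadline = Time.now + timeout
  tcp = Socket.tcp(host, port, connect_timeout: timeout)
  ctx = OpenSSL::SSL::SSLContext.new
  # Diagnostic only: skip verification so a self-signed certificate
  # cannot be the reason the handshake fails here.
  ctx.verify_mode = OpenSSL::SSL::VERIFY_NONE
  ssl = OpenSSL::SSL::SSLSocket.new(tcp, ctx)
  loop do
    result = ssl.connect_nonblock(exception: false)
    return :ok unless result == :wait_readable || result == :wait_writable

    remaining = deadline - Time.now
    return :timeout if remaining <= 0

    ready = if result == :wait_readable
              IO.select([tcp], nil, nil, remaining)
            else
              IO.select(nil, [tcp], nil, remaining)
            end
    return :timeout if ready.nil?
  end
rescue OpenSSL::SSL::SSLError, SystemCallError, SocketError => e
  [:error, "#{e.class}: #{e.message}"]
ensure
  tcp.close if tcp && !tcp.closed?
end
```

A call such as `probe_tls('xmpp.example.test', 5223)` (placeholder hostname) would then report `:timeout` when the server accepts the TCP connection but never answers the handshake, which is one way the behavior described above could arise.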

tykeal commented 11 years ago

rchady: Did you ever get this working? I still haven't been able to reproduce the issue myself so I'm curious.

rchady commented 11 years ago

Nope, tried it against 2 different openfire servers and it just hung in the same spot for both.

tykeal commented 11 years ago

So no errors, it just hangs? You may, if you like, try registering an account against our OpenFire server (it's also running a self-signed cert), available at bardicgrove.org on the standard ports. We've got a room set aside for bot testing: bottest@conference.bardicgrove.org. The server accepts federation in from Google and other services as well, if you don't want to register an account against it.

rchady commented 11 years ago

Was out all last week with the flu so will be spending most of this week playing catchup. Once caught up, I'll try to revisit this so maybe we can identify what is causing this behavior.

rchady commented 11 years ago

Ok, I finally got around to testing this out. Connecting to your server, it connects just fine, though I will mention that it did spit this error (unrelated to my current issue):

2013-02-11 16:57:05 [BOT] ERROR: failed initialing plugin: stats
undefined method `[]' for nil:NilClass
plugins/stats.rb:57:in `load'
./lib/tykebot.rb:194:in `call'
./lib/tykebot.rb:194:in `init_plugins'
./lib/tykebot.rb:191:in `each'
./lib/tykebot.rb:191:in `init_plugins'
./lib/tykebot.rb:212:in `run'
./lib/tykebot.rb:205:in `loop'
./lib/tykebot.rb:205:in `run'
./tykebot.rb:18

So now my question is... how do you have openfire configured? :) I have a pretty standard setup and as I mentioned I have tried it against 2 openfire servers, one totally unrelated to my production one.

tykeal commented 11 years ago

Truth to tell, I switched out OpenFire to ejabberd just over a week ago as OpenFire has been having several issues on my server for many months and I was tired of dealing with them. I had the server configured to require secure connections on the default port sets (5222 with StartTLS or 5223 with legacy SSL).
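
The two port conventions differ in when TLS begins. On 5223 the client starts the TLS handshake immediately, while on 5222 it first opens a plain XMPP stream and asks the server for permission to upgrade. The sketch below illustrates only that plain-text StartTLS exchange, using the namespaces defined in RFC 6120. It is not how TykeBot or its XMPP library implements it, the helper names are hypothetical, and the TLS upgrade itself is left out.

```ruby
require 'socket'

XMPP_TLS_NS = 'urn:ietf:params:xml:ns:xmpp-tls'

# Opening tag of a client-to-server XMPP stream.
def stream_header(domain)
  "<stream:stream to='#{domain}' xmlns='jabber:client' " \
    "xmlns:stream='http://etherx.jabber.org/streams' version='1.0'>"
end

# Element the client sends to request the TLS upgrade.
def starttls_request
  "<starttls xmlns='#{XMPP_TLS_NS}'/>"
end

# Read from io until `pattern` shows up; nil on timeout or EOF.
def read_until(io, pattern, timeout)
  buf = +''
  deadline = Time.now + timeout
  until buf =~ pattern
    remaining = deadline - Time.now
    return nil if remaining <= 0 || IO.select([io], nil, nil, remaining).nil?

    chunk = io.read_nonblock(4096, exception: false)
    return nil if chunk.nil? # peer closed the connection
    buf << chunk unless chunk == :wait_readable
  end
  buf
end

# Hypothetical helper: run the plain-text part of StartTLS over an open
# connection. A real client would wrap `io` in TLS after :ready_for_tls.
def negotiate_starttls(io, domain, timeout = 5)
  io.write(stream_header(domain))
  return :starttls_not_seen unless read_until(io, /<starttls\b/, timeout)

  io.write(starttls_request)
  return :proceed_not_seen unless read_until(io, /<proceed\b/, timeout)

  :ready_for_tls
end
```

Because every read carries a deadline, a server that stops responding partway through yields `:starttls_not_seen` or `:proceed_not_seen` rather than an indefinite wait.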

That error that the bot spits out on connect now is related to ejabberd wanting (though not requiring) a little more information of locally registered clients, and the bot doesn't handle it.

Is it possible for me to try connecting to one of the systems you're using? If you don't want to send the connection information via the issue, send it to me via email. My address is listed on my public profile, though I do have a human test in front of the mailbox ;)

rchady commented 11 years ago

Hah, would one of those issues be running out of java heap space frequently? Since I upgraded to 3.7.1 I've been having major issues... pretty crazy that I have it set to 1GB now for only like ~20 connections at any given time.

Anyhow, I can't give you access to our internal server, but I may be able to add you to another server I admin. I'll let you know in email once I get approval from the owner of the system it is on.

tykeal commented 11 years ago

It wasn't running out of heap space for me. What kept happening was that the federation component would unroll and not throw any errors. But once it did that none of my federated users could actually connect into the system and since over 50% of the folks that use my system are federating in from google it wasn't in the best interests of my sanity to have to constantly be restarting OpenFire at random intervals.

rchady commented 11 years ago

Oh fun! I dunno what the difference is, but I have had to go from 256M -> 1024M just to keep the thing up. I have monit set up right now to monitor for it puking and restart it.

Anyhow, the owner of the other system is not available until tomorrow. I'll ask him then.

rchady commented 11 years ago

I was wrong, the guy got back to me. You should have an email in your inbox now.

tykeal commented 11 years ago

rchady: Glad I could help you get the bot up and working. I'll go ahead and close this.