rhymeswithmogul / aprs-weather-submit

Manually submit weather station information to the APRS-IS network.
https://github.com/rhymeswithmogul/aprs-weather-submit/wiki
GNU Affero General Public License v3.0

Blocking "error in connect(): Connection timed out"... #12

Closed dl9sec closed 2 weeks ago

dl9sec commented 2 weeks ago

Hi Colin,

thanks for that great piece of software.

I want to use it within my home automation system ("FHEM") where I collect some environmental data, write it to a file and then cyclically call a bash script from my home automation system where the file is parsed, and a call of "aprs-weather-submit" with all the parameters is generated.

This works excellently so far except for one small thing: I observed that sometimes the connection to the APRS-IS is established, but then something stalls in the subsequent communication:

Compiled with debugging output.

Connecting to 85.188.1.27:14580...
> user DL9SEC pass XXXXX vers aprs-weather-submit/1.7.2-git
< # javAPRSSrvr 3.15b08
< # logresp DL9SEC verified, server CWOP-5
> DL9SEC-31>APRS,TCPIP*:@281152z4839.70N/00946.75E_270/002g002t070r000p000P000h56b10275FHEM WX (Exp.) - Gingen/Fils
error in connect(): Connection timed out
error in connect(): Connection timed out

It seems that after this timeout "aprs-weather-submit" doesn't return properly, or hangs, and therefore my home automation system hangs too (it is written in Perl, and the bash script is called by a forking system call which waits for the called program to return). When I then kill the "aprs-weather-submit" process manually, the home automation continues.

I wonder whether, in case of that error (aprs-is.c, line 160 ff.), the program should exit with a failure code. What do you think? Would it be possible to exit the program properly when such an error occurs?

Would be really helpful for me :-)

Thank you in advance.

Regards, Thorsten

dl9sec commented 2 weeks ago

It looks like I was a little too impatient... As far as I can see, there are three attempts. If they fail, the program returns. But the timeout seems very long. Would it be possible to make the number of attempts and the timeout configurable?

rhymeswithmogul commented 2 weeks ago

Yeah, I’ll take a look at it. Would you mind sharing the exact command line you're using? Also, have you ruled out any network issues on your end? (That being said, I'll have to test this app on a simulated poor network.)

dl9sec commented 2 weeks ago

Hi Colin,

thank you for taking a look at it.

The exact commandline was: aprs-weather-submit -M "FHEM WX (Exp.) - Gingen/Fils" -k DL9SEC-31 -I cwop.aprs.net -o 14580 -u DL9SEC -d XXXXXX -n 48.661667 -e 9.779167 -b 1027.5 -T 21.1 -h 56 -c 270 -S 2.1 -g 2.4 -P 0 -r 0 -p 0 (where XXXXX is my APRS-IS password)

I can rule out any network problems on my side. It seems that this effect is more pronounced between morning and afternoon. Perhaps the requests to the APRS-IS increase then...maybe...

The last connection problems appeared around 1600 UTC here:

Compiled with debugging output.

Connecting to 44.155.254.4:14580...
> user DL9SEC pass XXXXX vers aprs-weather-submit/1.7.2-git
< # javAPRSSrvr 3.15b08
< # logresp DL9SEC verified, server CWOP-3
> DL9SEC-13>APRS,TCPIP*:@281558z4839.70N/00946.76E_068/001g001t061r000p000P000h77b10266FHEM WX (Exp.) - Gingen/Fils
error in connect(): Connection timed out
error in connect(): Connection timed out
error in connect(): Connection timed out

Compiled with debugging output.

Connecting to 129.15.108.116:14580...
Connecting to 129.15.108.116:14580...
Connecting to 129.15.108.116:14580...
Connecting to 85.188.1.27:14580...
> user DL9SEC pass XXXXX vers aprs-weather-submit/1.7.2-git
< # javAPRSSrvr 3.15b08
< # logresp DL9SEC verified, server CWOP-5
> DL9SEC-13>APRS,TCPIP*:@281608z4839.70N/00946.76E_068/002g002t060r000p000P000h77b10267FHEM WX (Exp.) - Gingen/Fils

Since then, every transmission (one every ten minutes) has gone well. Currently I transmit with DL9SEC-13 instead of DL9SEC-31, which was experimental, and I changed the coordinates to the position of my old WX station...

Thank you.

Regards, Thorsten

rhymeswithmogul commented 2 weeks ago

It's trying to connect to a server that has multiple IP addresses. It tries to connect to one, then gives up and moves on. There might be a server issue or something, but let me see what I can do in my code to make it fail more gracefully.
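
For readers following along, the "try one address, give up, move on" behaviour described above matches the usual getaddrinfo() loop in C. The following is only a minimal sketch of that general pattern, not the code from aprs-is.c; the function name connect_first_working is hypothetical. Because it uses a plain blocking connect(), each dead address costs the full operating-system timeout, which would explain the long waits reported here.

```c
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from aprs-weather-submit): resolve host:port,
 * try every returned address in order, and return the first socket that
 * connects. Returns -1 if resolution fails or no address is reachable. */
int connect_first_working(const char *host, const char *port)
{
    struct addrinfo hints, *results, *ai;
    int fd = -1;

    assert(host != NULL && port != NULL);

    memset(&hints, 0, sizeof hints);
    hints.ai_family   = AF_UNSPEC;    /* accept both A and AAAA records */
    hints.ai_socktype = SOCK_STREAM;  /* APRS-IS speaks TCP */

    int rc = getaddrinfo(host, port, &hints, &results);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return -1;
    }

    for (ai = results; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;                 /* address family not usable here */
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                    /* connected: stop searching */
        perror("connect");            /* blocking connect failed; try next */
        close(fd);
        fd = -1;
    }

    freeaddrinfo(results);
    return fd;
}
```

A caller would use it as `int fd = connect_first_working("rotate.aprs2.net", "14580");` and treat a negative result as "no server reachable".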

dl9sec commented 2 weeks ago

That would be great, thank you :-)

rhymeswithmogul commented 2 weeks ago

I figured out how to make non-blocking I/O in C. It will time out after 15 seconds, but you can use the new --timeout parameter to pick one of your choosing. (Or, use a timeout of 0 for the old behavior.)
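
As an illustration of the general technique mentioned above, a non-blocking connect with a timeout is commonly built from fcntl(), connect(), poll(), and getsockopt(SO_ERROR). The sketch below shows that common pattern under stated assumptions; it is not the code that went into aprs-weather-submit. The function name connect_addr_with_timeout is hypothetical, and the "0 means block as before" convention is simply borrowed from the comment above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netdb.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open a socket for one getaddrinfo() result and
 * connect it, giving up after timeout_secs. Returns the connected socket,
 * or -1 with errno set (ETIMEDOUT on timeout). timeout_secs == 0 falls
 * back to an ordinary blocking connect(). */
int connect_addr_with_timeout(const struct addrinfo *ai, int timeout_secs)
{
    assert(ai != NULL && timeout_secs >= 0);

    int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
    if (fd < 0)
        return -1;

    int flags = fcntl(fd, F_GETFL, 0);
    if (timeout_secs > 0 &&
        (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0))
        goto fail;

    int rc = connect(fd, ai->ai_addr, ai->ai_addrlen);
    if (rc < 0 && timeout_secs > 0 && errno == EINPROGRESS) {
        /* Connection is in flight: wait until the socket is writable. */
        struct pollfd pfd = { .fd = fd, .events = POLLOUT };
        int ready;
        do {    /* note: a retry after EINTR restarts the full timeout */
            ready = poll(&pfd, 1, timeout_secs * 1000);
        } while (ready < 0 && errno == EINTR);

        if (ready == 0) {
            errno = ETIMEDOUT;        /* nothing happened in time */
        } else if (ready > 0) {
            /* Writable does not mean connected: read the real outcome. */
            int err = 0;
            socklen_t len = sizeof err;
            if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == 0) {
                if (err == 0)
                    rc = 0;
                else
                    errno = err;
            }
        }
    }
    if (rc < 0)
        goto fail;

    if (timeout_secs > 0)
        fcntl(fd, F_SETFL, flags);    /* restore blocking mode for later I/O */
    return fd;

fail:;
    int saved = errno;                /* close() may overwrite errno */
    close(fd);
    errno = saved;
    return -1;
}
```

In a loop over getaddrinfo() results, a helper like this would let the program skip an unresponsive address after a bounded wait instead of waiting for the operating system's much longer default.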

For what it's worth, cwop.aprs2.net seems to be the culprit here. It has many A/AAAA records, but not all of them appear to be working. You may be able to pick another of the Tier 2 servers (such as rotate.aprs2.net).

dl9sec commented 2 weeks ago

Hi Colin,

thanks for the effort. I checked out the branch and built it on my RasPi3. Testing is in progress, and it looks good so far. I will also try that "rotate" server address...

Thank you.

Regards, Thorsten