flightaware / Tcl-bounties

Bounty program for improvements to Tcl and certain Tcl packages

Scotty Improvements #1&2 - modernize build system, fix configure for -address and -port #26

Open jorge-leon opened 7 years ago

jorge-leon commented 7 years ago

I am starting to work on the Scotty build system to update it to current TEA (3.10), taking into account the comments in #7 .

jorge-leon commented 7 years ago
tmp = ckstrdup(Tcl_GetHostName()); <-- no memory here, 
p = strchr(tmp, '.');              <-- NULL pointer input to strchr here
resuna commented 7 years ago

That's not happening.

tmp = "vm-ubuntu-resuna"
p = NULL // because that's the output of strchr(tmp, '.')
Tcl_SetVar2(interp, "tnm", "domain", ++p /* INVALID POINTER */, TCL_GLOBAL_ONLY);
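A defensive version of that split could look like the sketch below. It is plain C with no Tcl dependency, and split_hostname() is a hypothetical helper name, not something that exists in Tnm. The only point it makes is that the pointer returned by strchr() must be checked before it is incremented.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Tnm: split "host.domain" at the
 * first dot. Writes the host part into host[] and returns a pointer to
 * the domain part inside fqdn, or "" when fqdn contains no dot.
 */
static const char *split_hostname(const char *fqdn, char *host, size_t hostlen)
{
    const char *dot = strchr(fqdn, '.');
    size_t n = dot ? (size_t)(dot - fqdn) : strlen(fqdn);

    if (hostlen > 0) {
        if (n >= hostlen) n = hostlen - 1;   /* truncate rather than overflow */
        memcpy(host, fqdn, n);
        host[n] = '\0';
    }
    /* Only step past the dot when strchr() actually found one. */
    return dot ? dot + 1 : "";
}
```

With the hostname from the trace above, the helper yields an empty domain instead of an invalid pointer; the caller can then decide whether to fall back to another source for the domain.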

resuna commented 7 years ago

Using the old version of Tnm it works:

% package require Tnm
3.0.1
% puts [list $tnm(host) $tnm(domain)]
vm-ubuntu-resuna gwp.corp.flightaware.com

So we're back to Tcl_GetHostName being "not trusted", on Linux.

jorge-leon commented 7 years ago

I understand that OOM is not the case here. Your patch resolves the situation where the return value of Tcl_GetHostName() has no domain part.

In our test environments an OOM is unlikely. But how should we handle situations where memory is allocated but the result is not checked?

resuna commented 7 years ago

We can fix that possible failure mode, yes. I wouldn't be surprised if there were lots of other unchecked allocations in the existing legacy code. The underlying problem here is the regression on Linux (and possibly Windows) caused by checkin e17ceb3. That needs to be reverted or have some kind of fallback that gets the domain.

mutability commented 7 years ago

ckstrdup calls ckalloc; ckalloc panics if it is OOM; caller doesn't need to do anything?
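The pattern described here, an allocator that aborts instead of returning NULL, can be sketched in plain C as follows. This is a stand-in for illustration only, not Tcl's implementation; alloc_or_panic() and strdup_or_panic() are hypothetical names. Tcl also offers attemptckalloc() for callers that do want a NULL return they can check.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a panicking allocator: it never returns NULL. */
static void *alloc_or_panic(size_t size)
{
    void *p = malloc(size ? size : 1);
    if (p == NULL) {
        fprintf(stderr, "unable to alloc %lu bytes\n", (unsigned long)size);
        abort();
    }
    return p;
}

/* Stand-in for a strdup built on top of it; callers need no NULL check. */
static char *strdup_or_panic(const char *s)
{
    size_t len = strlen(s) + 1;
    return memcpy(alloc_or_panic(len), s, len);
}
```

Under this convention an unchecked result is not a bug in the caller, because the failure path never returns.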

resuna commented 7 years ago

The behavior of "info hostname" on Linux and FreeBSD is also inconsistent.

jorge-leon commented 7 years ago

I propose initializing to recognizable invalid values, as is already done for the user name and the OS. A hostname of "unknown" and an empty string for the domain should do.

The original code uses the rather non-standard res_ninit() function, which essentially reads /etc/resolv.conf or the environment variable LOCALDOMAIN. https://linux.die.net/man/3/libbind-resolver

Tcl_GetHostName() on unix uses either uname(2) or gethostname(). The result with respect to the domain part depends on configuration, libc and operating system.

So: "unknown" and ""?

resuna commented 7 years ago

Hostname is not an issue. Whatever is returned by Tcl_GetHostName() is absolutely fine. It seems to be all [info hostname] does.

It doesn't need to work on Windows at all, but on Linux, if we can make it work there, I think we should. Possibly something like:

if(p)
    Tcl_SetVar2(interp, "tnm", "domain", ++p, TCL_GLOBAL_ONLY);
#ifdef linux
else
    Tcl_SetVar2(interp, "tnm", "domain", linux_res_ninit_hack(), TCL_GLOBAL_ONLY);
#endif
jorge-leon commented 7 years ago

It is absolutely no problem to revert to the original version. In fact, I stumbled over res_ninit() only because musl libc does not provide it. If we revert, scotty will not compile on Alpine Linux, and that's all.

But I can't stop ranting:

If your system is not configured accordingly (resolv.conf has no domain entry), res_ninit() will not give you a domain either.

IMHO it should be documented how to set up the system on each OS to get the right thing out of Tcl_GetHostName(), and we should rely on one centralized function that can be managed in the core instead of adding yet another heuristic that depends on OS-specific features.

jorge-leon commented 7 years ago

On Windows, Tcl_GetHostName() uses the GetComputerName() function, which returns the NetBIOS name of the local computer, so no domain name is expected to be attached.

I think I stand corrected: let's get the hostname with Tcl_GetHostName() and the Internet domain name with a function provided by the Internet-savvy Tnm library.

resuna commented 7 years ago

The machine I'm testing this on has a "search" line in resolv.conf, but no "domain" line. This is what dhcpd puts there on both FreeBSD and Linux.

Looking at the source to the version of res_ninit() on OS X and in glibc, in both cases it seems to be using the "search" line. In fact it uses search by preference, only falling back to domain if search isn't there.

So there's no configuration issue. These systems are correctly configured.
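For a platform without res_ninit(), the preference described above (first name on the "search" line, falling back to the "domain" line) could be approximated by reading the file directly. The sketch below works on resolv.conf-style text already loaded into a string; domain_from_resolv_text() is a hypothetical helper, and it deliberately ignores LOCALDOMAIN and every other rule a real resolver applies.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: pick a default domain out of resolv.conf-style
 * text. Prefers the first name on a "search" line and falls back to a
 * "domain" line. Returns 1 and fills out[] on success, 0 otherwise.
 */
static int domain_from_resolv_text(const char *text, char *out, size_t outlen)
{
    char search[256] = "", domain[256] = "";
    const char *line = text;

    while (*line != '\0') {
        size_t len = strcspn(line, "\n");
        char buf[512], key[16], val[256];

        if (len < sizeof buf) {
            memcpy(buf, line, len);
            buf[len] = '\0';
            /* Reads the keyword and only the first value after it. */
            if (sscanf(buf, "%15s %255s", key, val) == 2) {
                if (strcmp(key, "search") == 0) strcpy(search, val);
                else if (strcmp(key, "domain") == 0) strcpy(domain, val);
            }
        }
        line += len;
        if (*line == '\n') line++;
    }

    const char *pick = search[0] ? search : domain;
    if (pick[0] == '\0' || strlen(pick) >= outlen) return 0;
    strcpy(out, pick);
    return 1;
}
```

A file with only nameserver lines makes the helper return 0, which would map naturally onto leaving tnm(domain) empty.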

resuna commented 7 years ago

If you want to write the code to pull in the domain name using Tcl code from the Tnm library, as a fallback if Tcl_GetHostName() doesn't return a domain part, that's probably OK. In that case you should probably leave tnm(domain) unset in the "C" code and check on it later once you're mucking about in Tcl anyway.

resuna commented 7 years ago

Wouldn't the proper fix be to have an autoconf check for res_ninit() and make the code you removed from the Linux platform-dependent routine conditional on HAVE_RES_NINIT?
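On the C side, such a guard might look roughly like this. It is a sketch, assuming configure defines HAVE_RES_NINIT (for example through an AC_CHECK_FUNCS check) and that the platform's resolver exposes struct __res_state with a defdname field, as glibc and the BSD resolvers do. default_domain() is a hypothetical name, not the function that was removed.

```c
#include <stddef.h>
#include <string.h>

#ifdef HAVE_RES_NINIT            /* assumed to be set by configure */
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>
#endif

/*
 * Hypothetical helper: copy the resolver's default domain into out[],
 * or leave out[] empty when res_ninit() is unavailable or fails.
 */
static const char *default_domain(char *out, size_t outlen)
{
    if (outlen == 0) return "";
    out[0] = '\0';
#ifdef HAVE_RES_NINIT
    {
        struct __res_state res;
        memset(&res, 0, sizeof res);
        if (res_ninit(&res) == 0) {
            strncpy(out, res.defdname, outlen - 1);
            out[outlen - 1] = '\0';
            res_nclose(&res);
        }
    }
#endif
    return out;
}
```

On a libc without res_ninit(), such as musl, the function still compiles and simply reports an empty domain.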

jorge-leon commented 7 years ago

That would be a possibility. Also there are workarounds to find out the default domain if res_ninit() is not available.

Sidenote: Read what resolv.conf(5) has to say about the search line...

jorge-leon commented 7 years ago

As of now I have reverted the code for setting the tnm(domain) variable completely to the original, to avoid more interference when testing.

I have done a general cleanup to eliminate most compiler warnings and put config.h into place. The build is now more quiet.

The netdb.tests suite gets killed by FreeBSD because it eats up all memory when running the netdb ip tests. Please disable netdb-6.6 and netdb-6.7 e.g. by putting in a "knownBug" constraint, or using my version. This allows the whole testsuite to finish.

I attach my respective logs from FreeBSD 10.3/amd64 and Ubuntu 16.10/i686

make_check_freebsd10.3.log.txt make_check_ubuntu16.10.log.txt

resuna commented 7 years ago

Our Linux boxes are x86_64/amd64

resuna commented 7 years ago

I'll get with Karl this afternoon to see about testing your fork.

I notice a bunch of errors on the FreeBSD side related to UDP sockets with an address and port. Are they related to the work you're doing with the UDP bounty?

Eg:

==== udp-11.2.3 udp configure/send: send to configured, connected FAILED
==== Contents of test case:

    catch {rename udp# {}}
    set r [udp create -myaddress 127.0.0.1 -myport $::SOME_PORT]
    rename [udp create -myaddress 127.0.0.1 -myport $::OTHER_PORT] udp#
    udp# connect 127.0.0.1 $::OTHER_PORT
    udp# configure -address 127.0.0.1 -port $::SOME_PORT
    udp# send "nase"
    $r receive

---- Test generated error; Return code was: 1
---- Return code should have been one of: 0 2
---- errorInfo: can not bind socket: address already in use
    while executing
"udp create -myaddress 127.0.0.1 -myport $::SOME_PORT"
    ("uplevel" body line 3)
    invoked from within
"uplevel 1 $script"
---- errorCode: POSIX EADDRINUSE {address already in use}
==== udp-11.2.3 FAILED
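For reference, EADDRINUSE is what bind() reports when a second UDP socket asks for an address and port that another socket still holds and neither side has set a reuse option. A minimal POSIX sketch of that condition, independent of Tnm (second_bind_errno() is a hypothetical name):

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Bind one UDP socket to 127.0.0.1 on an ephemeral port, then try to
 * bind a second socket to the same address and port. Returns the errno
 * of the second bind(), 0 if it unexpectedly succeeds, or -1 if the
 * setup itself fails.
 */
static int second_bind_errno(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int first, second, rc = -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                        /* let the kernel pick a port */

    first = socket(AF_INET, SOCK_DGRAM, 0);
    if (first < 0) return -1;
    if (bind(first, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        getsockname(first, (struct sockaddr *)&addr, &len) == 0) {
        second = socket(AF_INET, SOCK_DGRAM, 0);
        if (second >= 0) {
            rc = bind(second, (struct sockaddr *)&addr, sizeof addr) == 0 ? 0 : errno;
            close(second);
        }
    }
    close(first);
    return rc;
}
```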
jorge-leon commented 7 years ago

Good to hear. If you are ready, please consider looking at the udp send improvements as well.

The udp errors in the test are unrelated to the changes. The tests were written for a Linux network stack. FreeBSD apparently does not allow sending and receiving on the same udp port at the same time.

There are other tests with similar problems, e.g. ones using an arbitrary IP number out of 127.0.0.0/8, which works on Linux but not on other operating systems, unless you define an IP alias on the loopback interface.

resuna commented 7 years ago

OK, I've reviewed the UDP errors in the tests. I noticed this comment:

Note: these use the same udp endpoint to send and receive. Might just rewrite the tests.

jorge-leon commented 7 years ago

Comment is mine. Tests up to udp-9.1 are from the original scotty. I didn't want to modify them much in order to maintain comparability.

Test groups 10 and 11 are for the new functionality. The tests work on the Linux network stack; I will adapt them to work on FreeBSD too.

resuna commented 7 years ago

If it's not a huge push-up, I think it would be better to make the previous tests consistent rather than not testing them on FreeBSD, since that's our primary platform and you're modifying 10 and 11 anyway.

jorge-leon commented 7 years ago

Updated udp tests and ran them on FreeBSD10.3/amd64, Debian Jessie, Ubuntu 16.10, MacOSX Yosemite.

Will need to update the documentation for the new udp# send functionality, since not all operating systems allow sendto() with an explicit destination on an already connect()ed socket.
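The platform difference can be probed with a few lines of POSIX C. The sketch below is independent of Tnm, and sendto_on_connected_errno() is a hypothetical name. It connects a UDP socket to a loopback receiver and then calls sendto() with an explicit destination. As far as I know, Linux accepts this, while some other stacks reject it with EISCONN.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Returns 0 if sendto() with an explicit destination works on a
 * connected UDP socket, the errno of sendto() if it fails, or -1 if
 * the socket setup itself fails.
 */
static int sendto_on_connected_errno(void)
{
    struct sockaddr_in dst;
    socklen_t len = sizeof dst;
    int rx, tx, rc = -1;

    memset(&dst, 0, sizeof dst);
    dst.sin_family = AF_INET;
    dst.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    dst.sin_port = 0;                         /* ephemeral receiver port */

    rx = socket(AF_INET, SOCK_DGRAM, 0);
    if (rx < 0) return -1;
    tx = socket(AF_INET, SOCK_DGRAM, 0);
    if (tx < 0) { close(rx); return -1; }

    if (bind(rx, (struct sockaddr *)&dst, sizeof dst) == 0 &&
        getsockname(rx, (struct sockaddr *)&dst, &len) == 0 &&
        connect(tx, (struct sockaddr *)&dst, sizeof dst) == 0) {
        /* Explicit destination on a socket that is already connected. */
        rc = sendto(tx, "nase", 4, 0,
                    (struct sockaddr *)&dst, sizeof dst) < 0 ? errno : 0;
    }
    close(tx);
    close(rx);
    return rc;
}
```

A result of EISCONN marks a system where udp connect and udp send with a configured destination should not be mixed.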

With my recent cleanups tkined builds fail on MacOSX. Currently looking into it. Fixed by adding a missing HAVE_CONFIG_H stanza.

resuna commented 7 years ago

Coincidentally, I just made some changes in the base scotty to make it handle trailing . in host name lookups like the stock resolver. Can you pull these patches in?

https://github.com/flightaware/scotty/commit/85022ae752efdab38d474cd79dfd212919e76be7 https://github.com/flightaware/scotty/commit/a10621d35835e7dc6a017f90f6169f5ad05c8d32 https://github.com/flightaware/scotty/commit/d8703aee25445e5634132f5dff550cc7ffd3150c

PS: It's using res_mkquery/res_send instead of res_search to provide more control over the process, per Karl.

PPS: Thanks for the Mac fix, I'll check it out today.

resuna commented 7 years ago

On Sierra I get these errors.

Undefined symbols for architecture x86_64:
  "_Tcl_GetErrorLine", referenced from:
      _GeneratorCmd in tnmSnmpTcl.o
      _WalkTree in tnmMibTcl.o
ld: symbol(s) not found for architecture x86_64

Edit: See http://wiki.tcl.tk/22108

resuna commented 7 years ago

After applying the 22108 fix it builds and seems to work, tkined has a bit of a glitch in the menus.

If you can't see an easy fix, don't worry about it.

jorge-leon commented 7 years ago

I have just pulled in your commits from https://github.com/flightaware/Tcl-bounties/issues/26#issuecomment-285065113. Do you have one for _Tcl_GetErrorLine handy?

Maybe this will get me to go with Tcl8.5 too.

In a minute I will test my recent patches for Tnm::dns on MacOSX and take a look at the menu issue.

resuna commented 7 years ago

I stuck the mod from the wiki into the files with the problem, but I haven’t sat down and gone “well, it should probably be in tnmSnmp.h or maybe config.h or blah blah blah”... you know, what’s the right place to put the thing.

Since you’ve had your nose in this more than me lately, I figure I’ll defer that to your judgement.

jorge-leon commented 7 years ago

That's fine with me.

jorge-leon commented 7 years ago

With the last commit, I could build and run tests of scotty on:

The operating systems still missing from your list would essentially be Debian Sid and macOS Sierra. You also asked for Debian 7.9 and Jessie 8.2: please confirm whether you want me to test them. These are non-updated versions of Wheezy and Jessie.

Other OSses tested:

I'm in contact with Minix3, and hope they get me a working rpcgen/rcp implementation to test here too.


The udp send command fix and the new functionality are in place and get tested on all systems. The test file udp.test has a list of systems where udp connect and udp send with a configured destination must not be mixed.

resuna commented 7 years ago

I've merged this into flightaware/scotty master.

resuna commented 7 years ago

Do we have stub support in this release?

jorge-leon commented 7 years ago

Everything is compiled with USE_TCL_STUBS and linked against tclstub8.6.