flightaware / Tcl-bounties

Bounty program for improvements to Tcl and certain Tcl packages
104 stars 8 forks source link

Scotty Improvements #1&2 - modernize build system, fix configure for -address and -port #26

Open jorge-leon opened 7 years ago

jorge-leon commented 7 years ago

I am starting to work on the Scotty build system to update it to current TEA (3.10), taking into account the comments in #7 .

jorge-leon commented 7 years ago

At this moment I have two branches in my clone of FlightAwares Scotty repository which build on Debian Jessie, FreeBSD 11 and Mac OS X Yosemite with Tcl stubs enabled.

Some of the included tests for Tnm run, sunrpc beeing notably an exeption since it segfaults. The results of the tests differ on different platforms and look strange on Mac OS X.

~~tkined starts up without the toolbar, so is not really usable. The problem might be with-- hardcoded paths in the libraries, which do not match TEA's idea about standard locations.~~ This was just a missing tkined.default file in the install target.

Interestingly, tkined segfaulted on a FreeBSD w/o X-org installed in the call to TkpGetSubfonts(), when launched via ssh and X11 forwarding.


I was able to contact J. Schönwalder and Mark Newnham. Ideas is to merge the work of Mark (icons, mostly) and to link history from 2000 to now (Scotty page in Tcler's Wiki).

jorge-leon commented 7 years ago

At this moment I have a fork of the FlightAware scotty repository which hopefully complies with the bounty requirements for both Scotty bounties.

Scotty has been split into two individual directories: tnm and tkined. Inside each directory you can do:

$ autoconf
$ ./configure options ...
$ make
$ make install

Tnm has received most love, you can run make test to run the testsuite, which should terminate successfully on Debian jessie, FreeBSD 11 and Mac OSX 10.10.5.

The TODO file in each directory lists all problems encountered with the software, there will be much more though...

jorge-leon commented 7 years ago

On Tnm::udp:

udp# configure does not set -address or -port on purpose. udp# send has two different use cases:

  1. You want to send packets to different host/port combinations with your specific udp# object.
  2. You want do send packets always to the same host/port combination.

In case 2. it is supposed to use udp# connect host port first. The udp# object is then said to be 'connected'. All future invocations of udp# send must not specify hostand port and always go to the same destination.

Note that the remote data hostand port are not stored in the udp# object, but rather in the socket associated with the udp# socket.

The very simple approach of my patch is to:

Notes:

jorge-leon commented 7 years ago

I just created a pull reqest with what I think is a completion of the given tasks.

Scotty needs some love though, but with the new build this should be easier to manage now.

For the upd bounty I have ideas for alternative implementations/funcionality, but maybe you try this on out first.

resuna commented 7 years ago

It needs some cleanup. I was going through the commits and noticed that there were some places where files were moved to backup files, and then new copies of files inserted. There's three backup files actually checked in to the repo:

tkined/generic/tkiPort.h~ tnm/generic/tnmPort.h~ tnm/tests/dns.test~

Can you clean that up, or at least verify that noting of value was lost?

jorge-leon commented 7 years ago

Oh! Fixed that one, nothing lost here.

Will countercheck right now, if all files are here against the original version.

resuna commented 7 years ago

"make test" in Tnm should be pulling in the version in the source tree, not the installed version, because that's what you're testing.

Tests began at Mon Feb 27 20:05:43 UTC 2017 dns.test couldn't read file "/usr/local/lib/Tnm3.0.2/library/init.tcl": no such file or directory icmp.test command "Tnm::dns" already exists job.test command "Tnm::dns" already exists map.test command "Tnm::dns" already exists

jorge-leon commented 7 years ago

Files are all there.

"make test":

The original scotty works that way, it explicitely requires you to install first. The reason is, you need nmicmpd and nmtrapd install setuid root.

However I think you are right and will rewrite this part of the Makefile accordingly. Do you think it is okay to set these programs setuid root in the source tree first and then install them into their final destination?

Note: nmtrapd seems not to be used by the tests.

resuna commented 7 years ago

Ah, OK.

There's a "make check" that seems to run out of the current directory, but it is SEGV-ing in scotty for me.

$ make check *** Signal 11

Stop.

jorge-leon commented 7 years ago

That is, because it uses the C - scotty. C - scotty currently segfaults, I have added tclscotty to compensate for now. See tnm/TODO.

resuna commented 7 years ago

Seems like there's a bit of work to do on this still.

jorge-leon commented 7 years ago

Yes, most of the items in the TODO files are already present in the original source from where I started, e.g. scotty spins about a minute on my machine before presenting a prompt, ... So I focussed on getting the library to compile and all tests running (with tclscotty).

I need feedback where the priorities are. Do you need smx? libsmi integration?

resuna commented 7 years ago

We need something that is functionally complete and has no regressions on FreeBSD and Linux compared with the FlightAware version. What items in TODO are in the code you started from, and what items are new?

Mac OS X may be required too.

jorge-leon commented 7 years ago

To find out exactly what are the differences I will run the tests and compare. May I suppose you are using FreeBSD 11 as primary development and test environment? Please specify which Linux distribution in which version you are using, and with which kernel version.

Is it ok to start testing with tclscotty? The original C scotty has no stub support and is quite a rewrite of of tclsh. So I don't know how long it takes to get this stable. The only real difference of C scotty with respect to package require Tnm is a modified event loop which prioritizes file events.

resuna commented 7 years ago

If you have to run the tests using tclscotty, then the Makefile should probably be running them that way too. Possibly the original scotty could be deprecated, if it's that obsolete. I'll check whether that's actually required.

jorge-leon commented 7 years ago

Please comment on development and testbed environments.

jorge-leon commented 7 years ago

Creating a new C-scotty was easier then I thought - it's already commited to my scotty repository.

The Tnm library hardcodes the absolute path to itself and uses it to construct further paths from where e.g. library/init.tcl is loaded. This might be messy, but it is in the original scotty code and I think we should check functionalty and regression against the current code base before (potential incompatible) improvements are made.

Find attached test logs made with a current checkout of FlightAwares' scotty repository on a FreeBSD 11 VirtualBox. test.log is created with make check and is incomplete because the sunrpc tests segfault. So I sourced each test file manually into the scotty binary and logged the results into test_individually.log. Please compare this with your results to see, if we can use this as a starting point.

After this I installed my current version and run tests with make test, logging into test_Tnm3.0.2.log. This would be a test of the Tnm library alone.

make check uses the scotty binary created in the source tree, its output can be found in test_scotty3.0.2.log. test.log.txt test_individually.log.txt test_scotty3.0.2.log.txt test_Tnm3.0.2.log.txt

resuna commented 7 years ago

FreeBSD 10.3, 11.0 macOS Sierra Linux - investigating

resuna commented 7 years ago

Ubuntu 16.10

jorge-leon commented 7 years ago

Can we agree on starting from FreeBSD 11. I would like you to run make check with the original FlightAware version in your environment and post the result, so we can compare.

resuna commented 7 years ago

Most of our environment is 10.3. I'm not sure I have an 11 environment I can test in.

jorge-leon commented 7 years ago

No problem: I'll set it up on 10.3.

resuna commented 7 years ago

make-check-original-freebsd-10.3.txt

jorge-leon commented 7 years ago

Do you have a 32 bit or 64 bit envorinment? The mib-7.14 mib format test (failure) would probably indicate the latter. Please post uname -a (mangled at your will) so I can set up accordingly.

resuna commented 7 years ago

VM I'm using for testing is:

FreeBSD vm-resuna1 10.3-RELEASE FreeBSD 10.3-RELEASE #0 r297264: Fri Mar 25 02:10:02 UTC 2016 root@releng1.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64

I noticed the mib format failures and they seem not a big deal. Same with most of the netdb failures. I don't think the bounty needs that kind of existing stuff to be cleaned up.

resuna commented 7 years ago

The new code doesn't seem to be pulling in configuration right, in particular it's redefining socklen_t when it shouldn't be.

unix/config.h is being generated with #define HAVE_SOCKLEN_T 1

But it looks like nothing is pulling that info into tnmInt.h so it's going:

ifndef HAVE_SOCKLEN_T

typedef size_t socklen_t;

endif

Which is why I think... when you build you get a bunch of stuff like this:

In file included from unix/nmtrapd.c:36:0: /usr/include/sys/socket.h:609:9: note: expected 'socklen_t ' but argument is of type 'int ' ssize_t recvfrom(int, void , size_t, int, struct sockaddr __restrict, socklen_t * __restrict);

jorge-leon commented 7 years ago

Great :) I have been cleaning up things I encounted on the way wherever it was obvious how to do it. And there is a lot more to do to get scotty state of the art of todays Internet and operating systems. e.g. interface detection and enumeration, which is related to IP address detection. scotty relies on /etc/hosts /etc/networks, etc. files which are not maintained coherently on todays systems.

resuna commented 7 years ago

Aha, just checked the old version, it had the same problem. I'll work with Karl on that.

jorge-leon commented 7 years ago

Original code uses config.h, however it did not pick it up everywhere correctly so I disabled config.h generation and all #defines go on the compilers commandline. Personally I thought investigating in depth at some moment and switch back to config.h.

Anywhere: HAVE_SOCKLEN_T is definitely defined on the compiler commandline. I will go through the source and check things.

Note that virtually no code has been changed. Just the autconf/configure/Makefile.in stuff.

jorge-leon commented 7 years ago

The mentioned problem in fact seems to show up precisely because we HAVE defined HAVE_SOCKLEN_T and use it. socklen_t is supposed to be unsigned 32 bit, see e.g. http://pubs.opengroup.org/onlinepubs/7908799/xns/syssocket.h.html.

The code repeatedly does: len = sizeof(some struct). sizeof returns size_t, which is, by definition, unsigned.

If we wouldn't have defined HAVE_SOCKLEN_T, Jürgen had redefined socklen_t to size_t for us, which is unsigned and we'd get a different compiler warning in the same place :)

Note: the original build method does not use -Wall, TEA(3.10) does, so we get a lot more warnings, all worth looking into.

P.D.: found the remedy: define len as socklen_t len, instead of int len wherever appropiate.

resuna commented 7 years ago

HAVE_SOCKLEN_T means it gets socklen_t from sys/socket.h which is a system file and by definition should be defining the correct type.

Edit: yeh, that should work. Weird place to get the error though.

Also, not a requirement for the bounty, because it's an existing condition.

I'm going to try using it and see if anything blows up.

jorge-leon commented 7 years ago

I have built and run the tests now on 64bit FreeBSD10.3. The test result is attached as test_scotty_freebsd10.3.log.txt

A summarized comparison can be found in compare.txt

Result: we got bug-compatibility :)

Note that tests are killed by the OS in the netdb.test step. I might suggest you run the remainig tests by hand and feed the output into a file so we can compare them too. They would be: ntp.test, snmp.test, sunrpc.test, syslog.test and udp.test.

Find my results in test_scotty_freebsd10.3_individual.log.txt

Oops: ntp.test missing. I post the output here:

$ ./scotty tests/ntp.test 
ntp.test:   Total   11  Passed  10  Skipped 1   Failed  0
Number of tests skipped for each constraint:
    1   ntpNotAvailable
resuna commented 7 years ago

Nice patches. Ignore my comment about the first, I see what you did there.

While you're in there, might want to do this:

diff --git a/tnm/unix/nmicmpd.c b/tnm/unix/nmicmpd.c
index ef70027..64c1a82 100644
--- a/tnm/unix/nmicmpd.c
+++ b/tnm/unix/nmicmpd.c
@@ -1181,7 +1181,7 @@ InitSockets()
     struct protoent *proto;
     int icmp_proto = 1;                        /* Capt'n Default */
     struct sockaddr_in maddr;
-#ifndef linux
+#if !defined(nec_ews) && !defined(linux) && !defined(USE_DLPI)
 #ifdef IP_HDRINCL
     int on = 1;                                /* karl */
 #endif
@@ -1511,10 +1511,8 @@ LookupNextRetryTimeout(struct timeval *tv)
     jobElem *job;
     struct timeval now;
     int min_diff = -1;
-    int window;

     gettime(&now, return);
-    window = GetWindow(0);

     for (job = job_list; job; job = job->next) {
         int tdiff;
jorge-leon commented 7 years ago

Patch applied. Whom may I blame in the commit message? Karl, you?

resuna commented 7 years ago

Me.

resuna commented 7 years ago

Looks like it needs some work on ubuntu.

$ tclsh all.tcl
Tcl 8.6.6 tests running in interp:  /usr/bin/tclsh
Tests running in working dir:  /home/peter/git/scotty/tnm/tests
Only running tests that match:  *
Skipping test files that match:  l.*.test
Only sourcing test files that match:  *.test
Tests began at Wed Mar 01 10:05:16 CST 2017
dns.test
Segmentation fault (core dumped)

For reference, this is with ubuntu 64-bit server 16.10 (yakkety), base install + make, autoconf, tcl-dev (which pulls in gcc).

Linux vm-ubuntu-resuna 4.8.0-22-generic #24-Ubuntu SMP Sat Oct 8 09:15:00 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

jorge-leon commented 7 years ago

Yes, both scotty and package require Tnm segfault, so it'll be the later who causes the problem. I tested on 32bit Ubuntu - same result.

jorge-leon commented 7 years ago

Bad news for me with macOS Sierra: can't upgrade, my MacMini (Late 2009) is too old :(

With the current MacOSX Yosemite and Xcode 5.0.2 I get more or less the same results as on FreeBSD 64bit.

resuna commented 7 years ago

macOS is lower priority than linux, so that's cool.

resuna commented 7 years ago

To build the old version of scotty on linux, if you need to do that for any reason, you need to un-comment-out (remove the #) two "# -lresolv"s in Makefile. The new version gets that right.

resuna commented 7 years ago

Here's the problem with it core dumping, looks like an old bug if it can't get a host name:

diff --git a/tnm/generic/tnmInit.c b/tnm/generic/tnmInit.c
index e9efb7d..4fa96ad 100644
--- a/tnm/generic/tnmInit.c
+++ b/tnm/generic/tnmInit.c
@@ -136,13 +136,15 @@ InitVars(Tcl_Interp *interp)

     tmp = ckstrdup(Tcl_GetHostName());
     p = strchr(tmp, '.');
-    if (p) *p = '\0';
-    Tcl_SetVar2(interp, "tnm", "host", tmp, TCL_GLOBAL_ONLY);
+    if (p) {
+       *p = '\0';
+       Tcl_SetVar2(interp, "tnm", "host", tmp, TCL_GLOBAL_ONLY);

-    /*
-     * Set the default domain name, if any
-     */
-    Tcl_SetVar2(interp, "tnm", "domain", ++p, TCL_GLOBAL_ONLY);
+       /*
+        * Set the default domain name, if any
+        */
+       Tcl_SetVar2(interp, "tnm", "domain", ++p, TCL_GLOBAL_ONLY);
+    }

     ckfree(tmp);
mutability commented 7 years ago

That doesn't look right, what about the domainless case?

resuna commented 7 years ago

Oh, yeh.

Much simpler, actually:

diff --git a/tnm/generic/tnmInit.c b/tnm/generic/tnmInit.c
index e9efb7d..727db53 100644
--- a/tnm/generic/tnmInit.c
+++ b/tnm/generic/tnmInit.c
@@ -142,7 +142,8 @@ InitVars(Tcl_Interp *interp)
     /*
      * Set the default domain name, if any
      */
-    Tcl_SetVar2(interp, "tnm", "domain", ++p, TCL_GLOBAL_ONLY);
+    if(p)
+       Tcl_SetVar2(interp, "tnm", "domain", ++p, TCL_GLOBAL_ONLY);

     ckfree(tmp);

Or should it set the variable to the empty string if there's no domain?

resuna commented 7 years ago

Aha, the version in FlightAware doesn't set the domain here. The bug was introduced in this commit:

https://github.com/jorge-leon/scotty/commit/e17ceb3f16cf322f54334802701d5ea7d42ce505

I'm not sure I entirely grok this commit, it seems to cause a regression on Windows, since tnm(dns) is no longer initialized. This may not be critical, since it doesn't seem to be initialized anywhere on UNIX. This fails on the old version too:

% puts $tnm(dns)
can't read "tnm(dns)": no such element in array

On the other hand, this comment seems worrying:

/*
 * Get the list of name servers from the Windows registry and set
 * the list of default DNS servers. Note, this should be done
 * automatically by the resolver but it is obviously not on
 * Windows machines.
 */
resuna commented 7 years ago

There's a couple of places this variable is used without checking if it exists, in the code, but setting tnm(domain) to null won't produce valid results:

https://github.com/flightaware/scotty/search?utf8=%E2%9C%93&q=%22tnm%28domain%29%22

jorge-leon commented 7 years ago

The mentioned commit follows the following rationale:

  1. Tcl already implements a platform independent way to get the hostname/domainname, if available, which is Tcl_GetHostName().

  2. Tnm had its own method, which only works if DNS resolution is set up correctly. Maybe at that time Tcl_GetHostName was not available or not trusted.

  3. So let's get rid of platform specific initialization of this variables and put it into the generic code.

  4. Since Windows is not the current target, for the moment leave the Windows initialization alone. It should however come down to deleting some superfluous lines of code.

    ~~Failing to check for a null pointer here is embarrassing (blush...), but why don't we get a hostname back on Ubuntu?!~~ I got it: its only the missing domain.

resuna commented 7 years ago

Hostname is there, domain name isn't.

peter@vm-ubuntu-resuna:~/git/scotty/tnm$ hostname
vm-ubuntu-resuna
peter@vm-ubuntu-resuna:~/git/scotty/tnm$ domainname
(none)

Question is, why did it work on FreeBSD:

peter@vm-freebsd-resuna1:~/git/scotty/tnm % hostname
vm-freebsd-resuna1
peter@vm-freebsd-resuna1:~/git/scotty/tnm % domainname
jorge-leon commented 7 years ago

What happens if ckalloc (inside ckstrdup) runs out of memory, and returns a NULL pointer?

A quick google does not get me to a definition what strchr() does with a NULL pointer as input string.

resuna commented 7 years ago

The strchr() doesn't have a null pointer on input. p is the output of the strchr(). tmp isn't null. p is.

On FreeBSD:

peter@vm-resuna1:~/bounty/git/scotty/tnm % tclsh
% package require Tnm
3.0.2
% puts [list $tnm(host) $tnm(domain)]
vm-resuna1 gwp.corp.flightaware.com

So Tcl_GetHostName() is getting a domain on FreeBSD but not on Linux. The domain name isn't set in domainname(1) on either platform.

resuna commented 7 years ago

This may be why there's all that extra code in the platform-specific initialization.