Bill-Gray / lunar

Basic astronomical functions
https://www.projectpluto.com/source.htm
GNU General Public License v2.0
92 stars 46 forks source link

Some functions, like moon phases calcs return crazy results on RaspberryPi #3

Closed stwf closed 6 years ago

stwf commented 6 years ago

I don't think this is endian related (Raspian runs LE I believe) maybe due to the byte alignment issue?

Any help would be appreciated!

screen shot 2018-10-16 at 5 29 55 pm
Bill-Gray commented 6 years ago

@stwf Well, that's interesting... the rise/set times are correct for the sun; the moon appears to be seriously mangled. This code is only marginally tested on the Raspberry Pi. A gent asked about me this about a year ago; he'd seen garbage lunar and solar positions. Looking at my correspondence with him, we got the solar part working after adding code to ensure that byte alignment and order would be handled correctly. The lunar part worked, but I had my doubts about it; it seemed to me as if alignment should be an issue, but it didn't crop up back then. One thing you might try : run ./tables 2018, to get a table of sun and moon rise/set values for this year (at present, you're getting rise/set times for the year 10.) I'm thinking there may be a long-term component that overflows when going back 2000 years. In any case, I'm going to revise lunar2.cpp to access data "correctly", using the same tools that were used for computing solar/planetary positions in vsopson.c... will get back to you then. I probably should get a RaspPi; it seems to be a good platform for stress-testing code!

Bill-Gray commented 6 years ago

(Insert sound of hand thumping forehead) I wrote the above thinking that the necessary modifications to lunar2.cpp were going to take a while. Turned out to be quite easy. Please download https://www.projectpluto.com/xfer/lunar2.cpp over your existing copy, and try building using that. As described, this uses "safe" functions to extract integers and doubles from the binary blob, such that alignment and byte order will be handled correctly. Let me know how it goes. If it fails, we'll try again; if it works, I'll add, commit, and push the changes.

stwf commented 6 years ago

Thanks! You were right about the overflow. ./tables 2018 gave much closer results, but still the moon stuff was still off by a few minutes, even with the new code, and the sun calculations still seem identical.

screen shot 2018-10-16 at 10 35 13 pm
Bill-Gray commented 6 years ago

Hmmm... I thought for sure that was going to do it! Just to clarify : was the lunar data given by ./tables 2018 garbage before applying the new code? Did the new code make any actual difference? Are lunar tables for 10 AD still really messed up? Another thing worth trying : there's a phases program to produce tables of lunar phases. Running ./phases 2018 -m2019 should get you the following... does it? I've looked through the code, and don't see remaining byte alignment/word order issues. Doesn't mean there are none, though, and that's the automatic suspect when something works everywhere except on a Raspberry Pi.

18 Dec 2017  6:30  26 Dec 2017  9:20   2 Jan 2018  2:24   8 Jan 2018 22:25  
17 Jan 2018  2:17  24 Jan 2018 22:20  31 Jan 2018 13:26   7 Feb 2018 15:53  
15 Feb 2018 21:05  23 Feb 2018  8:09   2 Mar 2018  0:51   9 Mar 2018 11:19  
17 Mar 2018 13:11  24 Mar 2018 15:35  31 Mar 2018 12:36   8 Apr 2018  7:17  
16 Apr 2018  1:57  22 Apr 2018 21:45  30 Apr 2018  0:58   8 May 2018  2:08  
15 May 2018 11:47  22 May 2018  3:49  29 May 2018 14:19   6 Jun 2018 18:31  
13 Jun 2018 19:43  20 Jun 2018 10:50  28 Jun 2018  4:52   6 Jul 2018  7:50  
13 Jul 2018  2:48  19 Jul 2018 19:52  27 Jul 2018 20:20   4 Aug 2018 18:18  
11 Aug 2018  9:57  18 Aug 2018  7:48  26 Aug 2018 11:56   3 Sep 2018  2:37  
 9 Sep 2018 18:01  16 Sep 2018 23:15  25 Sep 2018  2:52   2 Oct 2018  9:45  
 9 Oct 2018  3:46  16 Oct 2018 18:01  24 Oct 2018 16:45  31 Oct 2018 16:40  
 7 Nov 2018 16:02  15 Nov 2018 14:54  23 Nov 2018  5:39  30 Nov 2018  0:19  
 7 Dec 2018  7:20  15 Dec 2018 11:49  22 Dec 2018 17:48  29 Dec 2018  9:34  
 6 Jan 2019  1:28  14 Jan 2019  6:45  21 Jan 2019  5:16  27 Jan 2019 21:10
stwf commented 6 years ago

ok, so it seems the new code had no effect. Just a quick check revealed the same output in phases and tables whether I used the new code or the old. Here is a sanity check screen shot showing that the output of phases is also off. I did do a make clean between running!

screen shot 2018-10-17 at 1 46 18 pm
Bill-Gray commented 6 years ago

Still baffled here. A couple of things to try, though. The first may fix the problem; if it fails, the second may give us a clue as to how to proceed. First, the possible fix : try downloading https://www.projectpluto.com/xfer/lunar2.cpp again and recompiling. I've cleaned it up quite a bit. There was some code that made sense when I first wrote it, back when pipelining processors didn't exist, branching didn't cause much of a penalty, and floating-point math was to be avoided at all costs. In 2018, however, those parts are now a load of rubbish. Quite aside from that, it contained some "youthful indiscretions", i.e., coding practices of the sort I now blush to recall. The new version is somewhat more logical, and more logical code is usually less buggy. It also includes a sanity assert check to ensure that the input binary data at least starts out with the correct byte sequence. Second, the diagnostic tool : download https://www.projectpluto.com/xfer/ltest.cpp, a brand-new test routine that dumps out some of the intermediate steps in computing lunar positions. Compile with g++ -Wall -O3 -Wextra -pedantic -o ltest ltest.cpp riseset3.cpp liblunar.a and run ./ltest. It will compute a sample position for mid-2018, showing some intermediate quantities. The output should be as follows. At this point, I really have no idea where things are going off the rails, but ltest ought to give us some ideas.

0: 5.809755
1: 4.053803
2: 3.095401
3: 3.783075
4: 3.589186
5: 2.515754
6: 2.739882
7: 1.182453
8: 0.185000
Lat: -2.037301
Lon: 328.746413
Dist: 403463.893740
stwf commented 6 years ago

OK, that other change to lunar2.cpp did not help, but ./ltest yielded

0: 5.809755
1: 4.053803
2: 3.095401
3: 3.783075
4: 3.589186
5: 2.515754
6: 2.739882
7: 1.182453
8: 0.185000
Lat: -1.756993
Lon: 328.746233
Dist: 400361.574973

Maybe this is helpful? The iterations look right? Its just that lat number is off, and the resulting distance?

Bill-Gray commented 6 years ago

AHA! (Well, at least I think it's an "aha".) I changed various references to char to unsigned char. And whaddaya know, I got the same results you did on the Pi. A little bit of Web searching reveals that chars are, by default, unsigned on ARM hardware. I've changed the relevant data to be explicitly signed char. If you download lunar2.cpp yet again, I think you'll find that it works this time. If there is still trouble, I'd try adding the flag -fsigned_char to the compile line and re-building. Probably a generally useful trick when porting code to the RaspPi: if it breaks, and -fsigned-char fixes it, you know what sort of problem to look for. (Still something I ought to fix by making sure that the chars in question are explicitly signed.)

stwf commented 6 years ago

aaaaaand... it works!! Thanks so much. My calculations are spot on now

I've been writing a wrapper to use your code in an elixir package and its going really well. I replaced some python code and things are running much smoother. I will let you know when its ready for prime time.

Bill-Gray commented 6 years ago

Excellent. Changes now added, committed, and pushed.