dstndstn / astrometry.net

Astrometry.net -- automatic recognition of astronomical images
http://astrometry.net

astrometry-net issue: Scale range -inf to -inf is invalid #241

Closed iMichka closed 2 years ago

iMichka commented 2 years ago

Hello. I'm a maintainer of Homebrew, a package manager for macOS and Linux. We are trying to update wcslib from 7.6 to 7.7.

We have a small smoke test to check if astrometry-net still works:

image2pnm -h
build-astrometry-index -d 3 -o index-9918.fits -P 18 -S mag -B 0.1 -s 0 -r 1 -I 9918 -M -i /usr/local/Cellar/astrometry-net/0.85_1/examples/tycho2-mag6.fits
solve-field --config 99.cfg /usr/local/Cellar/astrometry-net/0.85_1/examples/apod4.jpg --continue --dir .

99.cfg contains:

add_path .
inparallel
index index-9918.fits

The last step fails:

2021-10-23T13:43:12.3826330Z ==> /usr/local/Cellar/astrometry-net/0.85_2/bin/solve-field --config 99.cfg /usr/local/Cellar/astrometry-net/0.85_2/examples/apod4.jpg --continue --dir .
2021-10-23T13:43:12.3828200Z Reading input file 1 of 1: "/usr/local/Cellar/astrometry-net/0.85_2/examples/apod4.jpg"...
2021-10-23T13:43:12.3829590Z jpegtopnm: WRITING PPM FILE
2021-10-23T13:43:12.3830190Z Read file stdin: 719 x 507 pixels x 1 color(s); maxval 255
2021-10-23T13:43:12.3831320Z Using 8-bit output
2021-10-23T13:43:12.3831820Z Extracting sources...
2021-10-23T13:43:12.3832330Z simplexy: found 1467 sources.
2021-10-23T13:43:12.3832780Z Solving...
2021-10-23T13:43:12.3833220Z Reading file "./apod4.axy"...
2021-10-23T13:43:12.3834440Z Scale range -inf to -inf is invalid: min must be >= 0, max must be >= min.
2021-10-23T13:43:12.3835950Z engine-main.c:324:main: Failed to read job file "./apod4.axy"
2021-10-23T13:43:12.3838000Z solve-field.c:518:run_engine engine failed.  Command that failed was:
2021-10-23T13:43:12.3839600Z   /usr/local/Cellar/astrometry-net/0.85_2/bin/astrometry-engine --config 99.cfg ./apod4.axy
2021-10-23T13:43:12.3840570Z  ioutils.c:568:run_command_get_outputs Command failed: return value 255
2021-10-23T13:43:12.3841830Z Error: astrometry-net: failed

Interestingly, this only fails on macOS 11 (ARM and Intel), but passes on macOS 12, and on Linux (ubuntu 16.04).

Could you help us debug this? I can reproduce this locally so I might be able to test things out if you need further feedback.

For reference: https://github.com/Homebrew/homebrew-core/pull/82101

dstndstn commented 2 years ago

That's weird! Are you sure it is related to the wcslib version bump?

Could you please add a "-v" to the solve-field run and send that log? And can you attach the apod4.axy file?

iMichka commented 2 years ago

Ha ha right I just tested again, it also fails with wcslib 7.6, so this is unrelated.

It's just that we built astrometry-net some time ago and it worked, and now that we are rebuilding astrometry-net from source, it fails. Either a dependency has changed somewhere and we did not notice, or something changed on macOS 11 (the compiler for sure, but maybe other things).

Here is the full log:

solve-field -v --config 99.cfg /usr/local/Cellar/astrometry-net/0.85_1/examples/apod4.jpg --continue --dir .
Reading input file 1 of 1: "/usr/local/Cellar/astrometry-net/0.85_1/examples/apod4.jpg"...
Base: "./apod4", basefile "apod4.jpg", basedir ".", suffix "jpg"
Checking if file "/usr/local/Cellar/astrometry-net/0.85_1/examples/apod4.jpg" ext 0 is xylist or image: image
  (not xyls because: Failed to open FITS table /usr/local/Cellar/astrometry-net/0.85_1/examples/apod4.jpg: Failed to open FITS file "/usr/local/Cellar/astrometry-net/0.85_1/examples/apod4.jpg")
Running: /usr/local/bin/image2pnm --infile /usr/local/Cellar/astrometry-net/0.85_1/examples/apod4.jpg --uncompressed-outfile /tmp/tmp.uncompressed.nXUbc4 --outfile /tmp/tmp.ppm.lTLVFj --ppm --mydir /usr/local/bin/solve-field
jpegtopnm: WRITING PPM FILE
  Dirs: ['/usr/local/bin/solve-field', '/usr/local/bin', '/usr/local/Cellar/astrometry-net/0.85_1/libexec/lib/python3.9/site-packages/astrometry/util']
  jpg
Running: pnmfile /tmp/tmp.ppm.lTLVFj
Converting PPM image to FITS...
Running: ppmtopgm /tmp/tmp.ppm.lTLVFj | /usr/local/bin/an-pnmtofits > /tmp/tmp.fits.utMVKd
Read file stdin: 719 x 507 pixels x 1 color(s); maxval 255
Using 8-bit output
Extracting sources...
Running image2xy: input=/tmp/tmp.fits.utMVKd, output=/tmp/tmp.xyls.uKNjFr, ext=0
nhdus=1
Got naxis=2, na1=719, na2=507
simplexy: nx=719, ny=507
simplexy: dpsf=1.000000, plim=4.000000, dlim=1.000000, saddle=2.000000
simplexy: maxper=1000, maxnpeaks=100000, maxsize=2000, halfbox=100
simplexy: median smoothing...
simplexy: measuring image noise (sigma)...
Sampling sigma at 936 points
Nsigma=0.7, s=2.02031
simplexy: found sigma=2.02031.
simplexy: finding objects...
simplexy: found 145 blobs
simplexy: finding peaks...
Failed to find (5x5) centroid of peak 0, subpeak 88 at (524,408)
Failed to find (5x5) centroid of peak 0, subpeak 145 at (46,188)
Failed to find (5x5) centroid of peak 0, subpeak 173 at (48,186)
Failed to find (5x5) centroid of peak 0, subpeak 232 at (71,239)
Failed to find (5x5) centroid of peak 0, subpeak 425 at (51,30)
Failed to find (5x5) centroid of peak 0, subpeak 742 at (97,481)
Failed to find (5x5) centroid of peak 0, subpeak 792 at (162,300)
Failed to find (5x5) centroid of peak 0, subpeak 812 at (383,34)
Failed to find (5x5) centroid of peak 0, subpeak 865 at (20,223)
Failed to find (5x5) centroid of peak 0, subpeak 874 at (358,80)
Failed to find (5x5) centroid of peak 0, subpeak 875 at (397,119)
Failed to find (5x5) centroid of peak 0, subpeak 919 at (12,137)
Failed to find (5x5) centroid of peak 0, subpeak 939 at (116,237)
Failed to find (5x5) centroid of peak 0, subpeak 940 at (167,206)
Failed to find (5x5) centroid of peak 0, subpeak 945 at (449,311)
Failed to find (5x5) centroid of peak 0, subpeak 953 at (213,253)
Failed to find (5x5) centroid of peak 15, subpeak 1 at (7,99)
Failed to find (5x5) centroid of peak 23, subpeak 32 at (655,186)
Failed to find (5x5) centroid of peak 44, subpeak 1 at (250,272)
Failed to find (5x5) centroid of peak 45, subpeak 8 at (569,313)
Failed to find (5x5) centroid of peak 45, subpeak 9 at (529,287)
Failed to find (5x5) centroid of peak 63, subpeak 1 at (557,367)
Failed to find (5x5) centroid of peak 67, subpeak 13 at (669,389)
Failed to find (5x5) centroid of peak 99, subpeak 0 at (422,465)
Failed to find (5x5) centroid of peak 101, subpeak 4 at (323,500)
Failed to find (5x5) centroid of peak 113, subpeak 2 at (4,479)
simplexy: found 1467 sources.
Removing lines of (spurious) sources from xylist "/tmp/tmp.xyls.uKNjFr", writing to "/tmp/tmp.removelines.clKyAN"
Running: /usr/local/bin/removelines /tmp/tmp.xyls.uKNjFr /tmp/tmp.removelines.clKyAN
removelines.py: Removed 0 sources
Sorting file "/tmp/tmp.removelines.clKyAN" to "/tmp/tmp.sorted.LLOkQY" using columns flux (FLUX) and background (BACKGROUND), descending
Running: /usr/local/bin/uniformize -n 10 /tmp/tmp.sorted.LLOkQY /tmp/tmp.uniform.UbU5jC
Uniformizing into 4 x 2 bins
Image bounds: x [1.5199,718.354], y [1.83071,506.046]
Writing headers to file ./apod4.axy
Copying data block of file /tmp/tmp.uniform.UbU5jC to output ./apod4.axy.
Deleting temp file /tmp/tmp.uncompressed.nXUbc4
Deleting temp file /tmp/tmp.xyls.uKNjFr
Deleting temp file /tmp/tmp.removelines.clKyAN
Deleting temp file /tmp/tmp.sorted.LLOkQY
Deleting temp file /tmp/tmp.uniform.UbU5jC
Running: /usr/local/bin/plotxy -I /tmp/tmp.ppm.lTLVFj -i ./apod4.axy -C red -w 2 -N 50 -x 1 -y 1 -P | /usr/local/bin/plotxy -i ./apod4.axy -I - -w 2 -r 3 -C red -n 50 -N 200 -x 1 -y 1 > ./apod4-objs.png
Solving...
Running:
  /usr/local/bin/astrometry-engine --verbose --config 99.cfg ./apod4.axy
Trying index index-9918.fits...
Trying path ./index-9918.fits...
Index name "./index-9918.fits" is readable, using as index filename
Index name "./index-9918.fits" is readable, using as index filename
Index name "./index-9918.fits" is readable, using as index filename
Index name "./index-9918.fits" is readable, using as index filename
Index scale: [1000, 1400] arcmin, [60000, 84000] arcsec
Index has 3072 quads and 1917 stars
Reading file "./apod4.axy"...
Set odds ratio to solve to 1e+09 (log = 20.7233)
Scale range -inf to -inf is invalid: min must be >= 0, max must be >= min.
engine-main.c:324:main: Failed to read job file "./apod4.axy"
solve-field.c:518:run_engine engine failed.  Command that failed was:
  /usr/local/bin/astrometry-engine --verbose --config 99.cfg ./apod4.axy
 ioutils.c:568:run_command_get_outputs Command failed: return value 255

Here is the file: https://www.icloud.com/iclouddrive/0OwRxmC4u_3iYGOpYj-eUMM4w#apod4

dstndstn commented 2 years ago

Okay, I don't understand why it's doing that.

We're fetching a value from the header that isn't there, so it's supposed to return -inf, and then we handle the -inf. https://github.com/dstndstn/astrometry.net/blob/0.85/solver/engine.c#L818
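For readers who don't want to open engine.c, here is a minimal sketch of the sentinel pattern being described. It is not the actual astrometry.net or qfits code: `card_t`, `header_get_double`, `read_scale_bound` and the key names used with them are hypothetical, and "handling the -inf" is assumed here to mean substituting a fallback value.

```c
#include <math.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a parsed header: one key/value pair per card. */
typedef struct {
    const char *key;
    double value;
} card_t;

/* Returns the value stored under key, or errval when the key is absent. */
double header_get_double(const card_t *cards, size_t n,
                         const char *key, double errval)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(cards[i].key, key) == 0)
            return cards[i].value;
    }
    return errval;
}

/* Reads one scale bound. A missing card comes back as dnil and is replaced
 * by fallback; the == test only works if -inf compares equal to itself. */
double read_scale_bound(const card_t *cards, size_t n,
                        const char *key, double fallback)
{
    const double dnil = -HUGE_VAL;
    double v = header_get_double(cards, n, key, dnil);
    return (v == dnil) ? fallback : v;
}
```

If the `v == dnil` comparison ever evaluates to false for a missing card, the -inf sentinel leaks through as if it were a real scale value, which would match the "Scale range -inf to -inf" message in the log.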

If there's any way you can build with debug symbols and step through in gdb, that would be very helpful...

make clean
make OPTIMIZE=no
gdb --args solver/astrometry-engine --verbose --config 99.cfg ./apod4.axy
(break 820)
p lo
p hi
p dnil

(etc...)

Thanks!

iMichka commented 2 years ago

I was not able to run gdb because I needed to codesign it on macOS. I had other things to finish first, so I only picked the subject up again today. Funnily, it no longer fails locally for me, so it's going to be hard to debug. The interesting part is that I did not change anything on my side, so it's weird that it is working now.

I compared the apod4.axy file I uploaded and the new one that was generated, and it's the same.

So the bug seems to be sort of transient. I'm trying to get back into a state where I can reproduce the error locally ...

iMichka commented 2 years ago

I used printf to print the values after line 820:

lo: -inf hi: -inf dnil: -inf

iMichka commented 2 years ago

lo == dnil -> 0

That's weird. Not a C expert though, so this might be expected / depend on the compiler?

dstndstn commented 2 years ago

+inf == +inf is required by IEEE-754, according to this https://stackoverflow.com/questions/41834621/c-ieee-floats-inf-equal-inf (the first answer gives a pointer to the chapter/verse in the standard)
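As a small illustration of that rule, the sketch below compares against -inf in two ways: by value, and by bit pattern. The function names are made up for this example, and the bit-pattern check assumes that `double` is an IEEE-754 binary64 type, which the static assertion only partly verifies.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(double) == sizeof(uint64_t),
               "this sketch assumes a 64-bit double");

/* Value comparison: true for -inf under ordinary IEEE-754 semantics. */
int equals_neg_inf(double x)
{
    return x == -HUGE_VAL;
}

/* Bit-pattern comparison: sign bit set, exponent all ones, mantissa zero
 * is the binary64 encoding of -inf. */
int has_neg_inf_bits(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits == UINT64_C(0xFFF0000000000000);
}
```

With default compiler flags both functions should agree. Under flags that let the compiler assume infinities never occur, the value comparison is the one at risk of being folded away.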

dstndstn commented 2 years ago

Ohhhhhh, but do we set the "-ffinite-math-only" flag when compiling??

If so, can you please try doing a make clean; make FLAGS=-O2 or similar?

iMichka commented 2 years ago

-ffinite-math-only is passed, I don't think that's the issue.

I gave you wrong information above. I checked the lo, hi values again: it's 73896, not -inf. I'm unsure what I did wrong when printing them out the first time. I'm not really fluent in C code, so it's a little bit tedious. What I did is printf("%d", hi); to get that value.

iMichka commented 2 years ago

Doh, that's me using %d, which makes printf print it as an int.

But I could reproduce the issue with a minimal example:

test.c

#include <math.h>
#include <stdio.h>
#include <stdlib.h>

double qfits_header_getdouble(double errval)
{
    return errval;
}

int main() {
    double dnil = -HUGE_VAL;

    double result = qfits_header_getdouble(dnil);
    printf("%f\n", result);
    printf("%f\n", dnil);
    printf("%d\n", result == dnil);
    if (result == dnil)
        puts("result == dnil");
    if (-HUGE_VAL == dnil)
        puts("-HUGE_VAL == dnil");
    if (result == -HUGE_VAL)
        puts("result == -HUGE_VAL");
    if (-HUGE_VAL == -HUGE_VAL)
        puts("-HUGE_VAL == -HUGE_VAL");

    return EXIT_SUCCESS;
}

➜  test clang -o test test.c && ./test
-inf
-inf
1
result == dnil
-HUGE_VAL == dnil
result == -HUGE_VAL
-HUGE_VAL == -HUGE_VAL
➜  test clang -o test test.c -Os && ./test
-inf
-inf
1
result == dnil
-HUGE_VAL == dnil
result == -HUGE_VAL
-HUGE_VAL == -HUGE_VAL
➜  test clang -o test test.c -Os -ffinite-math-only && ./test
-inf
-inf
0
-HUGE_VAL == dnil
-HUGE_VAL == -HUGE_VAL

The last one is the one which does not work, and this is exactly what we do: we remove your -O3 to replace it with -Os during our build (because that's the standard on macOS with clang).

Then I tried a few more things:

➜  test clang -o test test.c -Os -ffinite-math-only && ./test
-inf
-inf
0
-HUGE_VAL == dnil
-HUGE_VAL == -HUGE_VAL
➜  test clang -o test test.c -O2 -ffinite-math-only && ./test
-inf
-inf
0
-HUGE_VAL == dnil
-HUGE_VAL == -HUGE_VAL
➜  test clang -o test test.c -O3 -ffinite-math-only && ./test
-inf
-inf
0
-HUGE_VAL == dnil
-HUGE_VAL == -HUGE_VAL
➜  test clang -o test test.c -ffinite-math-only && ./test
-inf
-inf
1
result == dnil
-HUGE_VAL == dnil
result == -HUGE_VAL
-HUGE_VAL == -HUGE_VAL

So basically as soon as you pass an -Ox flag together with -ffinite-math-only, the optimizer is allowed to assume that values are never infinite (for both clang and gcc), and that breaks the comparison logic.

dstndstn commented 2 years ago

Thanks for this testing! Let me cook up a fix. There's no real reason we need to use -inf as a flag value here. (But I also want to find other places where we might be doing something similar!)
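One way to avoid an infinite flag value is to report "not found" through the return value and hand the number back through an out-parameter. The sketch below shows that idea only; it is not taken from the fixing commit, and `header_entry`, `header_find_double` and `read_bound_or` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a parsed header entry. */
typedef struct {
    const char *key;
    double value;
} header_entry;

/* Stores the value in *out and returns true when key is present.
 * Returns false and leaves *out untouched when it is absent. */
bool header_find_double(const header_entry *entries, size_t n,
                        const char *key, double *out)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(entries[i].key, key) == 0) {
            *out = entries[i].value;
            return true;
        }
    }
    return false;
}

/* Reads a bound, falling back to a default when the key is missing.
 * No floating-point comparison against a sentinel is involved. */
double read_bound_or(const header_entry *entries, size_t n,
                     const char *key, double fallback)
{
    double v;
    return header_find_double(entries, n, key, &v) ? v : fallback;
}
```

Because presence is carried by a `bool`, this version should behave the same whether or not the compiler is told to assume finite math.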

dstndstn commented 2 years ago

fixed in c26551fc874056124ffa7aa0fea893e59b2fcb35

iMichka commented 2 years ago

Thanks :)