astropy / pyvo

An Astropy affiliated package providing access to remote data and services of the Virtual Observatory (VO) using Python.
https://pyvo.readthedocs.io/en/latest
BSD 3-Clause "New" or "Revised" License
74 stars 50 forks source link

PyVO dateobs attribute not handling NaNs #549

Closed duytnguyendtn closed 1 month ago

duytnguyendtn commented 1 month ago

Hi PyVO maintainers,

I think I discovered that the CANDELS archive is returning a numpy nan for its "VOX:Image_MJDateObs", causing an error when trying to obtain the dateobs attribute on SIAResults:

from astropy.coordinates import SkyCoord
from astropy import units as u
from pyvo import registry

sia_service = registry.search(registry.Servicetype("sia"), registry.Waveband("optical"))['CANDELS'].get_service(service_type="sia")
coord = SkyCoord("53.15995 -27.78437", unit="deg")
rad = 1
sia_results = sia_service.search(coord,
                                 size=((rad * u.deg) if rad > 0 else None),
                                 format='image/fits')
sia_results[0].dateobs

returns the following traceback:

Traceback ``` --------------------------------------------------------------------------- TypeError Traceback (most recent call last) File E:\HEASARC\gitrepos\jdaviz\envlocal\lib\site-packages\astropy\time\core.py:635, in TimeBase._get_time_fmt(self, val, val2, format, scale, precision, in_subfmt, out_subfmt, mask) 634 val, val2 = cls._fill_masked_values(oval, oval2, mask, in_subfmt) --> 635 return cls(val, val2, scale, precision, in_subfmt, out_subfmt) 636 except UnitConversionError: File E:\HEASARC\gitrepos\jdaviz\envlocal\lib\site-packages\astropy\time\formats.py:169, in TimeFormat.__init__(self, val1, val2, scale, precision, in_subfmt, out_subfmt, from_jd) 168 else: --> 169 val1, val2 = self._check_val_type(val1, val2) 170 self.set_jds(val1, val2) File E:\HEASARC\gitrepos\jdaviz\envlocal\lib\site-packages\astropy\time\formats.py:535, in TimeNumeric._check_val_type(self, val1, val2) 534 if val1.dtype.kind == "f": --> 535 val1, val2 = super()._check_val_type(val1, val2) 536 elif not orig_val2_is_none or not ( 537 val1.dtype.kind in "US" 538 or ( (...) 541 ) 542 ): File E:\HEASARC\gitrepos\jdaviz\envlocal\lib\site-packages\astropy\time\formats.py:325, in TimeFormat._check_val_type(self, val1, val2) 324 if self.__class__._check_finite: --> 325 self._check_finite_vals(val1, val2) 327 if getattr(val1, "unit", None) is not None: 328 # Convert any quantity-likes to days first, attempting to be 329 # careful with the conversion, so that, e.g., large numbers of 330 # seconds get converted without losing precision because 331 # 1/86400 is not exactly representable as a float. File E:\HEASARC\gitrepos\jdaviz\envlocal\lib\site-packages\astropy\time\formats.py:318, in TimeFormat._check_finite_vals(self, val1, val2) 317 if not (ok1 and ok2): --> 318 raise TypeError( 319 f"Input values for {self.name} class must be finite doubles" 320 ) TypeError: Input values for mjd class must be finite doubles The above exception was the direct cause of the following exception: ValueError Traceback (most recent call last) Cell In[3], line 13 9 rad = 1 10 sia_results = sia_service.search(coord, 11 size=((rad * u.deg) if rad > 0 else None), 12 format='image/fits') ---> 13 sia_results[0].dateobs File E:\HEASARC\gitrepos\jdaviz\envlocal\lib\site-packages\pyvo\dal\sia.py:706, in SIARecord.dateobs(self) 704 dateobs = self.getbyucd("VOX:Image_MJDateObs") 705 if dateobs: --> 706 return Time(dateobs, format="mjd") 707 else: 708 return None File E:\HEASARC\gitrepos\jdaviz\envlocal\lib\site-packages\astropy\time\core.py:2021, in Time.__init__(self, val, val2, format, scale, precision, in_subfmt, out_subfmt, location, copy) 2019 self._set_scale(scale) 2020 else: -> 2021 self._init_from_vals( 2022 val, val2, format, scale, copy, precision, in_subfmt, out_subfmt 2023 ) 2024 self.SCALES = TIME_TYPES[self.scale] 2026 if self.location is not None and ( 2027 self.location.size > 1 and self.location.shape != self.shape 2028 ): File E:\HEASARC\gitrepos\jdaviz\envlocal\lib\site-packages\astropy\time\core.py:559, in TimeBase._init_from_vals(self, val, val2, format, scale, copy, precision, in_subfmt, out_subfmt) 556 mask, val_data2 = get_mask_and_data(mask, val2) 558 # Parse / convert input values into internal jd1, jd2 based on format --> 559 self._time = self._get_time_fmt( 560 val_data, val_data2, format, scale, precision, in_subfmt, out_subfmt, mask 561 ) 562 self._format = self._time.name 564 # Hack from #9969 to allow passing the location value that has been 565 # collected by the TimeAstropyTime format class up to the Time level. 566 # TODO: find a nicer way. File E:\HEASARC\gitrepos\jdaviz\envlocal\lib\site-packages\astropy\time\core.py:643, in TimeBase._get_time_fmt(self, val, val2, format, scale, precision, in_subfmt, out_subfmt, mask) 638 except (ValueError, TypeError) as err: 639 # If ``format`` specified then there is only one possibility, so raise 640 # immediately and include the upstream exception message to make it 641 # easier for user to see what is wrong. 642 if len(formats) == 1: --> 643 raise ValueError( 644 f"Input values did not match the format class {format}:" 645 + os.linesep 646 + f"{err.__class__.__name__}: {err}" 647 ) from err 648 else: 649 problems[name] = err ValueError: Input values did not match the format class mjd: TypeError: Input values for mjd class must be finite doubles ```

of particular importance:

File E:\HEASARC\gitrepos\jdaviz\envlocal\lib\site-packages\pyvo\dal\sia.py:706, in SIARecord.dateobs(self)
    704 dateobs = self.getbyucd("VOX:Image_MJDateObs")
    705 if dateobs:
--> 706     return Time(dateobs, format="mjd")
    707 else:
    708     return None

Should pyvo be doing a nan check before feeding to the Astropy Time class?

duytnguyendtn commented 1 month ago

Found by @camipacifici, FYI @trjaffe

bsipocz commented 1 month ago

I think we should propagate it upstream and see if a masked return is possible from Time, but for an immediate workaround your suggestion will work here.

ManonMarchand commented 1 month ago

In can reproduce and confirm:

import numpy as np
np.isnan(sia_results.getrecord(0).getbyucd("VOX:Image_MJDateObs"))

After your example gives True. Thanks for spotting that.

Do you want to contribute your suggestion?

duytnguyendtn commented 1 month ago

Absolutely! Should be as simple as this, right? https://github.com/astropy/pyvo/pull/550/commits/1f99832020b9c46b5e924f3fd1bc18c5e4d08e95

duytnguyendtn commented 1 month ago

For some additional background here, you can see the full output on the CANDELS service using the query: https://archive.stsci.edu/siap/search.php?id=candels&pos=53.15995,-27.78437

For posterity (in the event this gets patched), here is a snippet of the returned VOTable that ultimately resulted in the nan:

<DATA>
<TABLEDATA>
<TR>
<TD>hlsp_candels_hst_acs_gsw05-sect23_f814w_v0.5_drz.fits</TD>
<TD>MAST:HST:ACS</TD>
<TD>53.1459921976</TD>
<TD>-27.7863364</TD>
<TD>
<![CDATA[ http://archive.stsci.edu/pub/hlsp/candels/goods-s/gsw05/v0.5/hlsp_candels_hst_acs_gsw05-sect23_f814w_v0.5_drz.fits ]]>
</TD>
<TD>1049760000</TD>
<TD/> <!-- EMPTY DATETIME HERE -->
<TD>2</TD>
<TD>16200 16200</TD>
...

20240522_CANDELS_MJDMeanNan.txt