astropy / astroquery

Functions and classes to access online data resources. Maintainers: @keflavich and @bsipocz and @ceb8
http://astroquery.readthedocs.org/en/latest/
BSD 3-Clause "New" or "Revised" License
706 stars 400 forks source link

io.votable warnings #55

Closed taldcroft closed 11 years ago

taldcroft commented 11 years ago

All the warnings below from io.votable don't inspire confidence in the result and are likely to generate questions from users. Are these warnings a sign of non-compliant VO-tables from SIMBAD, or something else that can be fixed?

In [4]: tables = [astroquery.simbad.QueryId(x).execute().table for x in ['m31','m51','omc1']]
WARNING: W22: None:3:0: W22: The DEFINITIONS element is deprecated in VOTable 1.1.  Ignoring [astropy.io.votable.exceptions]
WARNING: W27: None:4:0: W27: COOSYS deprecated in VOTable 1.2 [astropy.io.votable.exceptions]
WARNING: W50: None:12:0: W50: Invalid unit string '"h:m:s"' [astropy.io.votable.exceptions]
WARNING: W50: None:15:0: W50: Invalid unit string '"d:m:s"' [astropy.io.votable.exceptions]
WARNING: W47: None:33:0: W47: Missing arraysize indicates length 1 [astropy.io.votable.exceptions]
WARNING: W47: None:36:0: W47: Missing arraysize indicates length 1 [astropy.io.votable.exceptions]
WARNING: W49: None:44:98: W49: Empty cell illegal for integer fields. [astropy.io.votable.exceptions]
WARNING: W46: None:44:121: W46: char value is too long for specified length of 1 [astropy.io.votable.exceptions]
astrofrog commented 11 years ago

Given that no services provide standard compliant tables (cc @mdboom) I think your first option is a likely hypothesis ;-) Can we ask the SIMBAD people to fix the tables?

mdboom commented 11 years ago

W46 is the most worrisome as it implies loss of data, but all of them appear to be legitimate.

keflavich commented 11 years ago

We'll need to clean up all these warnings by the first release... but if they're upstream issues, I'm not sure how to deal with them. @cdeil, this might be a good thing to work on.

mdboom commented 11 years ago

I would recommend this:

1) Report all of the warnings upstream to the tool that created them. Reference this page which describes what the various warnings mean and often point back to the relevant part of the spec.

2) If querying a particular server, in this case SIMBAD, always returns malformed data, you can temporarily turn off warnings when parsing the results. The catch_warnings context function in the stdlib is helpful for this. I would explicitly blacklist specific warnings, so that any we don't expect from SIMBAD won't creep through. And I'd put a note next to this to remove it in the future when upstream fixes their data.

3) I would recommend never hiding W47 or W46, since they do imply a loss of data, and the user should know that their fields have been truncated.

Background: I take a hard line on standard conformity when it comes to VOTable. That seems to be the point, right? If everyone produces data that requires special hacks and workarounds in the reader, it doesn't really make data exchange any easier, right?

keflavich commented 11 years ago

@mbdoom- thanks for the summary & suggestions. I agree that a standard only makes sense if enforced. I just didn't want the responsibility of enforcement =)

What we may do, though, is try to slim down the warnings to one big angry warning that "Data is truncated - see self.votable_warnings for details" if the missing data warnings persist and are too verbose.

keflavich commented 11 years ago

All votable warnings are suppressed by default. As we've found out over the past few months (and others have known for a long time), there are very few fully compliant VOTables out there: http://stsdas.stsci.edu/astrolib/vo_compliance_tests/ So we'll continue suppressing & reporting upstream as we go.