django-daiquiri / daiquiri

A framework for the publication of scientific databases
https://escience.aip.de/daiquiri
Apache License 2.0
26 stars 8 forks source link

Empty values are not correctly transmitted #49

Closed olebole closed 3 years ago

olebole commented 4 years ago

This issue may or may not related to #48:

When one queries a table column that has empty values, it seems that the invalid value 'none' is transferred:

import pyvo
x_url = 'https://gaia.aip.de/tap'
qry = 'SELECT top 10 priam_flags FROM gdr2.gaia_source'
tap_service = pyvo.dal.TAPService(x_url)
tap_service.run_sync(qry)

gives a stack trace that ends in

DALFormatError: ValueError: None:20:28: \
   ValueError: invalid literal for int() with base 10: 'none' (in row 1, col 'priam_flags')

For me it is a bit unclear what should be transmitted for empty cells, but the string 'none' makes it impossible to query the column with PyVO. It seems that an empty string would fit better, see

https://github.com/astropy/astropy/blob/a0f5940bd00ff7259ed6b2391bfddecd007a2a43/astropy/io/votable/converters.py#L821-L842

            if value == '':
                if config['version_1_3_or_later']:
                    mask = True
                else:
                    warn_or_raise(W49, W49, (), config, pos)
                if self.null is not None:
                    value = self.null
                else:
                    value = self.default
            elif value == 'nan':
                mask = True
                if self.null is None:
                    warn_or_raise(W31, W31, (), config, pos)
                    value = self.default
                else:
                    value = self.null
            elif value.startswith('0x'):
                value = int(value[2:], 16)
            else:
                value = int(value, 10)

I however didn't find where this is actually set in Daiquiri.

jochenklar commented 4 years ago

I tested it locally, and it works fine with proper NULL values in the database. Maybe the data in gdr2.gaia_source is not using NULL but something weird.

olebole commented 4 years ago

@galkinAIP @harenk can you check that?

harenk commented 4 years ago

. so we need to provide NULL values for each empty column in a row?  This means we have to change the database table creation, using "DEFAULT NULL" always when no special "NOT NULL" required?  As of now, no table in alma083 database has this. Before I go and change, I'd like to test on tables, which are not yet published, to have no unintended consequences cropping up

Ole Streicher wrote:

@galkinAIP https://github.com/galkinAIP @harenk https://github.com/harenk can you check that?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/django-daiquiri/daiquiri/issues/49#issuecomment-628048222, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADIC32XK4QAXGG3BEI7GAWDRRKYS7ANCNFSM4M54XZTA.

--


jochenklar commented 4 years ago

Sorry, now I can reproduce it (it fails for int columns). I will try to fix it.

jochenklar commented 4 years ago

Ok fixed in 7ca6d11302a9534b3f23d69cd12dadffada76173. It was only affecting tap/sync since in this case pg_dump is not involved, so that NULL values are not 'NULL' strings, but Python None.