Segfault-Inc / Multicorn

Data Access Library
https://multicorn.org/
PostgreSQL License
700 stars 145 forks source link

handling of Null fields with custom fdw. #102

Closed oscardssmith closed 9 years ago

oscardssmith commented 9 years ago

I am currently implementing a custom data wrapper for a flat database structure. This is a custom database that uses 0000/00/00 as a default date instead of SQL's NULL. This is an example of what one row looks like when pulled and formatted as an array. ['10', '589', '589', '1995/09/10', '0000/00/00', '2011/07/26', '2012/07/31', '2014/03/31', '0000/00/00', '0000/00/00', '1', '0', '0', '-1', '-1', '0', '0', '0000/00/00 00:00:00', '0000/00/00 00:00:00', 'Wands', 'Eric', 'Manuel', 'Magic', 'Jr.', '', '', '', '02469', '4', '0', '0', '0', '0', '0', '0', '2', '0', '0', '0', '0', '', '', '', '', '', '', '', ''] When I feed this to sql, I get the message ERROR: date/time field value out of range: "0000/00/00". How can I tell SQL that these dates should be NULL? I've tried changing them to "", "NULL", NONE as well as a few others. Thanks

rdunklau commented 9 years ago

I don't think this is a problem with Multicorn itself, but you should be able to detect them in your foreign data wrapper and replace them by a Python None value, which will be converted to NULL by Multicorn.

oscardssmith commented 9 years ago

When I set it to none using this code "for field in text: if field== "0000/00/00" or field == "0000/00/00 00:00:00": newtext.append(None) else: newtext.append(field) yield newtext" I get a different error, ERROR: invalid input syntax for type date: ""

oscardssmith commented 9 years ago

Is there any way to make this indent correctly?

rdunklau commented 9 years ago

Ok I understand, it is a bug in the (not frequently used) code which transforms python sequences to postgres tuples.

Basically, you can return either sequences or dict for each line, in the execute method. The recommended approach is to use dict.

I'll get a fix for the sequence API, in the meantime you could try to use a dictionary instead, ie:

if field== "0000/00/00" or field == "0000/00/00 00:00:00":
    newtext[my_column_name] = None

(for information with github markdown, please look here: https://guides.github.com/features/mastering-markdown/)

oscardssmith commented 9 years ago

Ok thanks. I worked for way to long trying to get that to work.

rdunklau commented 9 years ago

Further testing reveals that it should indeed work as intented. Is it possible for you to paste your foreign data wrapper's code ?

oscardssmith commented 9 years ago

import subprocess import cStringIO from multicorn import ForeignDataWrapper from .utils import log_to_postgres from logging import WARNING

class PatientsFdw(ForeignDataWrapper): """A foreign data wrapper for accessing partner databases. Valid options: there aren't any """

def __init__(self, fdw_options, fdw_columns):
    super(PatientsFdw, self).__init__(fdw_options, fdw_columns)
    self.columns = fdw_columns

def execute(self, quals, columns):
    fields="pcc cus acc born deceased lastphy nextphy lastvis lastmiss lastcan tot missed cancel missreason canreason referral carecenter missts cants last first middle nick suffix mcaidno mcareno mrecno chartno reln provider physical ethnicity race[0] race[1] race[2] sex lang[0] lang[1] lang[2] contactpref contact addr1 addr2 city state zip phone email"
    fetchcmd="/PCC/partner/etc/ppull -N -s7 -S3 patients "+ fields
    f=cStringIO.StringIO(subprocess.check_output(fetchcmd, shell=True))
        for line in f:
            text=line.split('^')
            newtext=[]
            for field in text:
                if field== "0000/00/00" or field == "0000/00/00 00:00:00":
                    newtext.append(None)
                else:
                    newtext.append(field)
            fline=dict(zip(fields.split(), newtext))
            yield fline
        f.close()
rdunklau commented 9 years ago

Thank for reporting this issue !

oscardssmith commented 9 years ago

Thank you for fixing it and also getting me to use dict which is better anyway!