chrthomsen / pygrametl

Official repository for pygrametl - ETL programming in Python
http://pygrametl.org
BSD 2-Clause "Simplified" License
289 stars 41 forks source link

BatchFactTable using Multirow can't handle None values #50

Closed camara-raf closed 2 years ago

camara-raf commented 2 years ago

Hi,

I'm trying to optimize a batch load using the multirow, and unfortunately the None values are converted to a string 'None' instead of 'NULL'. As this isn't any character on a string that needs to be sanitized, there is no method to address it from outside.

file:tables.py class: BatchFactTable parameter: usemultirow = True

current code can add single quotes to string values. Values that are not string, are converted to string without any special treatment to None: line:1872

lambda c: "'" + row[c].replace("'", "''") + "'"
                if type(row[c]) is str else str(row[c]), self.all)) + ')'

A suggestion would be to replace the value None with NULL when converting the value to string:

lambda c: "'" + row[c].replace("'", "''") + "'"
                if type(row[c]) is str else str(row[c]).replace('None','NULL'), self.all)) + ')'
skejserjensen commented 2 years ago

Hi @camara-raf ,

Thank you very much for opening this issue (it is indeed a bug) and for the suggested fix. The lambda should probably be moved to a proper function as it has become a bit complex. So I have created PR #51 which moves the lambda to a function and adds a case for None.

Can you check if PR #51 fixes your issue?

camara-raf commented 2 years ago

Hi @skejserjensen . Thank you for addressing this issue so quickly. Yes, PR #51 fixes my issue and the function is even better than the replace() method. Best Regards.

skejserjensen commented 2 years ago

Thank you for testing that the fix works @camara-raf, I will merge the PR and close the issue.