dataiku / dss-plugin-neo4j

A DSS Plugin to interact with the Neo4j graph database
Apache License 2.0

[Bug] String values "NA" are considered as N/A values #29

Open Cobra5197 opened 2 years ago

Cobra5197 commented 2 years ago

Hello,

The connector treats the string value "NA" as a missing (N/A) value. Our column holds country codes (iso_code_alpha2), and we used it as the primary key to create unique nodes. We got an error saying that one of the values was null, and after a few hours of searching we found the cause: the alpha-2 code for Namibia, "NA".

Even string properties equal to "NA" are replaced with null values.
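
The behaviour can be reproduced with pandas alone. The following is a minimal sketch, not the plugin's code: the sample data and the helper names `read_default` and `read_keep_na` are made up for illustration. It relies on the fact that "NA" is in pandas' default list of missing-value markers, and on the `keep_default_na` and `na_values` parameters of `pandas.read_table`.

```python
import io

import pandas as pd

# Hypothetical sample data: a tab-separated table of country codes.
TSV = "iso_code_alpha2\tname\nFR\tFrance\nNA\tNamibia\n"


def read_default(text):
    """Parse with pandas defaults: the string "NA" becomes a missing value."""
    return pd.read_table(io.StringIO(text))


def read_keep_na(text):
    """Parse with the default NA markers disabled.

    Only the empty string is still treated as missing, so "NA" survives
    as an ordinary string.
    """
    return pd.read_table(
        io.StringIO(text),
        keep_default_na=False,
        na_values=[""],
    )


default_df = read_default(TSV)
fixed_df = read_keep_na(TSV)

# With defaults, the Namibia row loses its key.
print(default_df["iso_code_alpha2"].isna().tolist())
# With keep_default_na=False, the code is kept as the string "NA".
print(fixed_df["iso_code_alpha2"].tolist())
```

Passing `na_values=[""]` is one possible choice here; which markers should still count as missing depends on the dataset.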

Regards, Cobra

SFuller4 commented 1 year ago

My team ran into this issue as well. After some digging, I determined that the plugin is missing some functionality: the create_dataframe_iterator function in the commons file cannot pass na_values and keep_default_na to Dataset.iter_dataframes_forced_types, which builds pandas dataframes using the pandas.read_table() function.

Since the dataiku library already supports passing them, I've added this functionality to the plugin and pushed PR #33 to solve it.
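
The idea of forwarding these two options through an iterator helper can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: `create_dataframe_iterator_sketch` is a hypothetical name, and a chunked `pandas.read_table` call stands in for `Dataset.iter_dataframes_forced_types`, whose exact signature is not shown in this thread.

```python
import io

import pandas as pd

# Hypothetical sample data with Namibia's alpha-2 code.
TSV = "iso_code_alpha2\tname\nFR\tFrance\nNA\tNamibia\nDE\tGermany\n"


def create_dataframe_iterator_sketch(
    source, chunksize=2, na_values=None, keep_default_na=True
):
    """Yield dataframes in chunks, forwarding the NA-handling options.

    Assumption: pandas.read_table with chunksize is used here only as a
    stand-in for the Dataiku dataset iterator.
    """
    return pd.read_table(
        source,
        chunksize=chunksize,
        dtype=str,  # read every column as text in this sketch
        na_values=na_values,
        keep_default_na=keep_default_na,
    )


def collect_codes(**na_options):
    """Collect the key column across all chunks for a given set of options."""
    codes = []
    for chunk in create_dataframe_iterator_sketch(io.StringIO(TSV), **na_options):
        codes.extend(chunk["iso_code_alpha2"].tolist())
    return codes


# Defaults reproduce the bug: one key is parsed as missing.
default_codes = collect_codes()
# Disabling the default NA markers keeps "NA" as a string.
fixed_codes = collect_codes(keep_default_na=False)
print(fixed_codes)
```

Keeping `keep_default_na=True` as the helper's default preserves the existing behaviour for callers that do not opt in.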