dlt-hub / dlt

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
https://dlthub.com/docs
Apache License 2.0

Salesforce column names not following naming convention correctly #2087

Open lsimon14 opened 2 days ago

lsimon14 commented 2 days ago

dlt version

1.4

Describe the problem

In version 1.3 and earlier, running a Salesforce pipeline to Postgres converted double underscores in column names to single underscores. Version 1.4 no longer does this. The Salesforce API naming convention appends __c to every custom field, so when I run the pipeline in 1.4 it recreates all of the custom-field columns as new columns. For example, the field example_custom_field__c was loaded as example_custom_field_c in 1.3, but is recreated as example_custom_field__c in 1.4.
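To make the difference concrete, here is a minimal sketch of the two observed outcomes for a custom field name. The helpers `collapse_underscores` and `keep_underscores` are hypothetical stand-ins written for this illustration; they are not dlt's actual normalization code, and they only mimic the column names reported above.

```python
import re


def collapse_underscores(name: str) -> str:
    """Hypothetical stand-in for the 1.3 result: each run of '_' becomes a single '_'."""
    return re.sub(r"_+", "_", name.lower())


def keep_underscores(name: str) -> str:
    """Hypothetical stand-in for the 1.4 result: underscores are left untouched."""
    return name.lower()


# Salesforce API name of a custom field (custom fields end in "__c").
api_name = "example_custom_field__c"

old_column = collapse_underscores(api_name)  # column name seen with 1.3
new_column = keep_underscores(api_name)      # column name seen with 1.4

print(old_column)  # example_custom_field_c
print(new_column)  # example_custom_field__c
```

Because the two names differ, the destination table ends up with both columns after upgrading, and new data lands only in the second one.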

Expected behavior

Column names loaded from Salesforce should have double underscores replaced with single underscores in the destination, as they did in 1.3 and earlier.

Steps to reproduce

Using the Salesforce verified source, create a pipeline that pulls data from Salesforce and includes a field whose API name contains two consecutive underscores (any custom field meets this criterion). Run the pipeline with Postgres as the destination and observe that the output column names retain the double underscore.
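To check an existing table for the symptom, one option is to list its column names (for example from information_schema.columns in Postgres) and look for names that differ only in how many underscores they contain. The sketch below is a plain-Python helper under that assumption; `find_split_columns` and the sample column list are hypothetical and not part of dlt.

```python
import re
from collections import defaultdict


def find_split_columns(columns):
    """Group column names that become identical once runs of '_' are collapsed.

    Returns a dict mapping the collapsed name to the sorted list of actual
    column names, keeping only groups with more than one member.
    """
    groups = defaultdict(list)
    for col in columns:
        groups[re.sub(r"_+", "_", col)].append(col)
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


# Hypothetical column list of a table loaded first with 1.3 and then with 1.4.
columns = ["id", "name", "example_custom_field_c", "example_custom_field__c"]

duplicates = find_split_columns(columns)
print(duplicates)
```

Any entry in the result points at a pair of columns that hold the same Salesforce field under the old and the new name.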

Operating system

macOS

Runtime environment

Local

Python version

3.10

dlt data source

Salesforce

dlt destination

Postgres

Other deployment details

No response

Additional information

No response