slingdata-io / sling-cli

Sling is a CLI tool that extracts data from a source storage/database and loads it in a target storage/database.
https://docs.slingdata.io
GNU General Public License v3.0
398 stars 27 forks source link

Split `date` & `timestamp` Data Types #210

Closed fabrice-etanchaud closed 6 months ago

fabrice-etanchaud commented 6 months ago

Hi ! Testing 1.1.12, I found that native date columns are changed to general datetime columns leading to a timestamp in the target db :

https://github.com/slingdata-io/sling-cli/blob/2e4babad75071702717a963378132b23c5061263/core/dbio/database/templates/types_native_to_general.tsv#L90

Could they be mapped to date ?

Thank you !

flarco commented 6 months ago

Yes, it was designed this way for simplicity, but it's a good time to look into splitting dates & timestamps.

flarco commented 6 months ago

Done. v1.1.14 should map dates correctly. Closing.

fabrice-etanchaud commented 6 months ago

Thank you @flarco, it's a great chance to see that sling is such an active project !

TylerTollefson commented 5 months ago

Looks like this change introduced an issue when converting oracle date it is now being mapped as a generic type date, however Oracle "date" do include time components. So when an Oracle date is being mapped to bigquery date, that time component is lost...

flarco commented 5 months ago

Interesting. So, in oracle date is the same as datetime? For now, you can overrride the type with source_options.colums: { "date_col": datetime }

TylerTollefson commented 5 months ago

Correct, which is why in the patch before v1.1.14 the mapping of the types worked. The override should work for now though, thanks!

flarco commented 5 months ago

Thanks, will fix.

TylerTollefson commented 5 months ago

sling-cli/core/dbio/database/templates/types_native_to_general.tsv

Just wanted to circle back on this after I've had some more time to test in my env, the issue here is that there is no mapping for Oracle datetime anymore. When the override suggested above was in my replication yaml it failed the migration into BigQuery because it couldn't coerce from timestamp to datetime. Removing the override allows the migration to finish but we end up losing the time part of the datetime type.

flarco commented 5 months ago

Hey, are you using the latest version? The date type is mapped to datetime (here), so resolution should not be lost.

TylerTollefson commented 5 months ago

Theres no mapping for Oracle's datetime type, that what seems to be missing. If I understand correctly, we would need to have a datetime -> datetime for Oracle to BigQuery.

flarco commented 5 months ago

Interesting, I don't see a datetime data type in the docs... is this a new thing? But correct, it needs to be added in there. I just added it for next release: https://github.com/slingdata-io/sling-cli/commit/6a5af65dd38ea3ded8dea4e73fd49b7c0a0bc08b Do you mind compiling the binary in branch 1.2.7 and testing it out?

TylerTollefson commented 5 months ago

Sure, I'll give that a try and let you know.