Bertverbeek4PS / bc2adls

Exporting data from Dynamics 365 Business Central to Azure data lake storage or MS Fabric lakehouse
MIT License
49 stars 20 forks source link

Decimal data type #96

Closed lhammerbh closed 6 months ago

lhammerbh commented 6 months ago

Hi Bert

In the CopyBusinessCentral Notebook there is a column conversion taking place, where all the data types from the json files are converted to spark/lakehouse data types. In particular the Decimal data type is converted to float. Here is the code snippet.

       if col_type == "Decimal":
            col_type = "float"

The float data type comes out as a "real" data type in the SQL endpoint, which is basically also a float. This cause rounding errors in the Amounts, when we do not use a decimal data type, and per default it does not provide sufficient number of decimals.

Is it possible to use a decimal instead of a float?

I'm not great with Spark data types, but according to this, there should be a decimal data type: https://spark.apache.org/docs/latest/sql-ref-datatypes.html

Yours Lars Hammer

Bertverbeek4PS commented 6 months ago

@lhammerbh indeed for this I have choosen float. With decimaptype you have to choose how many decimals you want:

if col_type == "Decimal": col_type = DecimalType(10, 4)

So we need to descide what the decimal will be. Only issue is that if we are changing it it will be a breaking change. Or we need to set a parameter for this.

lhammerbh commented 6 months ago

Thanks for your concise answer. I am not great (yet) on Spark data types, so I did not know that we needed to set the number of decimals. So I think you made a good compromise on setting it to a float. When using the SQL Endpoint, we usually prefer the SQL Server Decimal type for amounts, and that was my reason for opening this issue. I'll close this issue.

Bertverbeek4PS commented 6 months ago

@lhammerbh well I can uptake a setting where you can choose if you want a float or a decimal and specify the number of decimals. Would that sufficient enough?

lhammerbh commented 6 months ago

That will definitely be great. Usually we would use 5 or 10 decimals and 2 decimals for the users to view. Thanks.

Bertverbeek4PS commented 6 months ago

@lhammerbh I have created a new branche with the parameter. Can you test this? https://github.com/Bertverbeek4PS/bc2adls/blob/decimal-value-notebook-parameter/fabric/CopyBusinessCentral.ipynb

lhammerbh commented 6 months ago

Yes, Birthe and I will be happy to test this. We will get back to you, when tested.

lhammerbh commented 6 months ago

Thanks, @Bertverbeek4PS This is fully approved. One learning from this is that there is a noticeable difference between Decimal(,) and Float. Using a float, empty values will be 0 Using a decimal, empty values will be null.