DmytroMitin opened 2 years ago
Hello @DmytroMitin, thank you very much for this PR! Also, amazing StackOverflow answer; it's great to spread knowledge about Spark and Scala 3.
I am not very familiar with the Spark type system unfortunately, so I have a question: why do you need the two cases `bigint` and `long`? Are both `Long` and `BigInt` converted to the same underlying Spark datatype?
@vincenzobaz
Well, I'm not an expert in Spark types either :)
Unfortunately, it seems `DecimalType` codecs for `BigInt` are not always enough:

- If a dataframe contains only small `BigInt`s (fitting into `long`), then `LongType` is used (https://gist.github.com/DmytroMitin/3c0fe6983a254b350ff9feedbb066bef) and using `DecimalType` codecs is an error (https://gist.github.com/DmytroMitin/ad77677072c1d8d5538c94cb428c8fa4).
- If a dataframe contains large `BigInt`s (not fitting into `long`), then `DecimalType` is used (https://gist.github.com/DmytroMitin/8124d2a4cd25c8488c00c5a32f244f64) and using `LongType` codecs is an error (https://gist.github.com/DmytroMitin/3a3a61082fbfc12447f6e926fc45c7cd).
- If a dataframe contains both small and large `BigInt`s, then `DecimalType` codecs are OK too (https://gist.github.com/DmytroMitin/626e09a63a387e6ff1d7fe264fc14d6b).
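The dividing line in the cases above is whether a `BigInt` value fits into a 64-bit `long`. A minimal sketch of that boundary check in plain Scala (no Spark needed; `isValidLong` is the standard-library predicate, and the values here are just illustrative):

```scala
// A BigInt "fits into long" exactly when it lies in [Long.MinValue, Long.MaxValue].
val small = BigInt(Long.MaxValue)       // largest value still representable as a Long
val large = BigInt(Long.MaxValue) + 1   // one past the Long range

println(small.isValidLong)  // true  -> such a column can be stored as LongType
println(large.isValidLong)  // false -> such values force DecimalType
```

So a dataframe whose `BigInt` column happens to contain only `isValidLong` values can end up with a `LongType` schema, which is why a single `DecimalType` codec is not sufficient on its own.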
Thank you for the exhaustive explanation with examples, I hope it was not too painful to go through this.
Since you seem to understand this topic well, I would very much appreciate your advice on #15, if you are interested:
https://stackoverflow.com/questions/74249859/spark-df-astype-fails-to-compile