DmytroMitin opened 2 years ago
Hello @DmytroMitin, thank you very much for this PR! Also, amazing StackOverflow answer; it's great to spread knowledge about Spark and Scala 3.
I am not very familiar with the Spark type system unfortunately, so I have a question: why do you need the two cases `bigint` and `long`? Are both `Long` and `BigInt` converted to the same underlying Spark datatype?
@vincenzobaz
Well, I'm not an expert in Spark types either :)
Unfortunately, it seems `DecimalType` codecs for `BigInt` are not always enough:

- If a dataframe contains only small `BigInt`s (fitting into `long`), then `LongType` is used (https://gist.github.com/DmytroMitin/3c0fe6983a254b350ff9feedbb066bef) and using `DecimalType` codecs is an error (https://gist.github.com/DmytroMitin/ad77677072c1d8d5538c94cb428c8fa4).
- If a dataframe contains large `BigInt`s (not fitting into `long`), then `DecimalType` is used (https://gist.github.com/DmytroMitin/8124d2a4cd25c8488c00c5a32f244f64) and using `LongType` codecs is an error (https://gist.github.com/DmytroMitin/3a3a61082fbfc12447f6e926fc45c7cd).
- If a dataframe contains both small and large `BigInt`s, then `DecimalType` codecs are OK too (https://gist.github.com/DmytroMitin/626e09a63a387e6ff1d7fe264fc14d6b).
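The dividing line in the cases above is whether a `BigInt` value fits into a 64-bit `long`. A minimal sketch of that boundary check in plain Scala (no Spark needed; `isValidLong` is the standard-library predicate, and the values here are just illustrative):

```scala
// A BigInt "fits into long" exactly when it lies in [Long.MinValue, Long.MaxValue].
val small = BigInt(Long.MaxValue)       // largest value still representable as a Long
val large = BigInt(Long.MaxValue) + 1   // one past the Long range

println(small.isValidLong)  // true  -> such a column can be stored as LongType
println(large.isValidLong)  // false -> such values force DecimalType
```

So a dataframe whose `BigInt` column happens to contain only `isValidLong` values can end up with a `LongType` schema, which is why a single `DecimalType` codec is not sufficient on its own.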
Thank you for the exhaustive explanation with examples, I hope it was not too painful to go through this.
Since you seem to understand this topic well, I would very much appreciate your advice on #15, if you are interested:
https://stackoverflow.com/questions/74249859/spark-df-astype-fails-to-compile