Support sequences + most types used in spark

michael72 commented 1 year ago

Hello @vincenzobaz

I wanted to evaluate the use of Scala 3 in our project, which uses Spark, and came across your project and found it very useful. However, code generation did not work entirely, since we also rely on collections in the case classes we use with Spark. Hence I've tried my best to add most of the data types that are supported in Spark SQL.
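
To illustrate the kind of type mapping involved, here is a minimal, self-contained Scala 3 sketch. It is not the code from the PR and does not depend on Spark: `SqlType`, `schemaOf` and the `Order` case class are hypothetical names, and the strings only mimic Spark SQL type names. It shows how collection fields (`Seq`, `Map`, `Option`) can be resolved by composing givens for their element types.

```scala
import scala.compiletime.{constValue, erasedValue, summonInline}
import scala.deriving.Mirror

// Hypothetical typeclass: the Spark SQL type name a Scala type would map to.
trait SqlType[T]:
  def name: String

object SqlType:
  given SqlType[Int] with
    def name = "IntegerType"
  given SqlType[String] with
    def name = "StringType"
  // Collections are handled by composing the givens of their element types.
  given [T](using t: SqlType[T]): SqlType[Seq[T]] with
    def name = s"ArrayType(${t.name})"
  given [K, V](using k: SqlType[K], v: SqlType[V]): SqlType[Map[K, V]] with
    def name = s"MapType(${k.name}, ${v.name})"
  given [T](using t: SqlType[T]): SqlType[Option[T]] with
    def name = s"${t.name} (nullable)"

// Field names of a case class, read from the mirror's label tuple.
inline def fieldLabels[T <: Tuple]: List[String] =
  inline erasedValue[T] match
    case _: EmptyTuple => Nil
    case _: (l *: ls)  => constValue[l].toString :: fieldLabels[ls]

// One SqlType lookup per field type.
inline def fieldTypes[T <: Tuple]: List[String] =
  inline erasedValue[T] match
    case _: EmptyTuple => Nil
    case _: (t *: ts)  => summonInline[SqlType[t]].name :: fieldTypes[ts]

// Describe a (flat) case class as a struct of its fields.
inline def schemaOf[T](using m: Mirror.ProductOf[T]): String =
  fieldLabels[m.MirroredElemLabels]
    .zip(fieldTypes[m.MirroredElemTypes])
    .map((label, tpe) => s"$label: $tpe")
    .mkString("StructType(", ", ", ")")

// Hypothetical case class with collection fields.
case class Order(id: Int, items: Seq[String], tags: Map[String, Int], note: Option[String])
```

Calling `schemaOf[Order]` walks the four fields at compile time; a field type without a matching given fails compilation rather than failing at runtime.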

Maybe you can have a look at the PR - sorry it is quite big. I also added sbt-scalafmt which makes it possible to use scalafmtAll and scalafmtSbt inside sbt.
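
For reference, enabling the plugin usually takes a single line in `project/plugins.sbt`. The version number below is an assumption for illustration and may differ from the one used in the PR; the plugin also expects a `.scalafmt.conf` file in the project root.

```scala
// project/plugins.sbt
// Version is an assumption; use whichever release the build actually pins.
addSbtPlugin("org.scalameta" % "sbt-scalafmt" % "2.5.2")
```

After reloading sbt, `scalafmtAll` formats the sources of all configurations and `scalafmtSbt` formats the build definition files.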

There is one long-ish and quirky match expression at the end of Serializer and Deserializer, in summonAll, which now makes it possible to encode and decode very long data types. I know this is generally possible by raising the inline limit, for example with "-Xmax-inlines:256", but with the "bulk processing" of the tuples that I added, I don't need that setting in our project and it doesn't crash the compiler with a stack overflow. I tried to fix that stack overflow, but it seems the best way to do so is to restrict the stack size of that summonAll expression; otherwise the compiler crashes occasionally in our project, especially on clean builds.
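
A rough sketch of the bulk-processing idea, under stated assumptions: `Tag` and `summonTags` are hypothetical stand-ins for the library's typeclasses and its summonAll, and the chunk size of four is chosen only to keep the example short. Matching several tuple elements per case means the recursive inline call is expanded once per chunk instead of once per element, which keeps the inlining depth lower for long case classes.

```scala
import scala.compiletime.{erasedValue, summonInline}

// Hypothetical stand-in for a Serializer/Deserializer-style typeclass.
trait Tag[T]:
  def label: String

given Tag[Int] with
  def label = "int"
given Tag[String] with
  def label = "string"
given Tag[Boolean] with
  def label = "boolean"

// Summon one instance per tuple element, four elements per recursion step.
inline def summonTags[T <: Tuple]: List[Tag[?]] =
  inline erasedValue[T] match
    case _: EmptyTuple => Nil
    // Bulk case: consume four element types before recursing on the rest.
    case _: (t1 *: t2 *: t3 *: t4 *: ts) =>
      summonInline[Tag[t1]] :: summonInline[Tag[t2]] ::
        summonInline[Tag[t3]] :: summonInline[Tag[t4]] :: summonTags[ts]
    // Fallback for the last one to three elements.
    case _: (t *: ts) =>
      summonInline[Tag[t]] :: summonTags[ts]
```

With this shape, a six-element tuple needs one bulk step and two single steps rather than six single steps; a larger chunk size reduces the depth further at the cost of a longer match expression.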

I also added myself as a developer - I hope that is OK.

I also hope that this PR is OK - or give me your thoughts and please review. Thanks!

Regards Michael

btw - this is the 2nd try for the PR - this time with the correct branches (I hope)

vincenzobaz commented 1 year ago

Hi @michael72, thank you for the PR and I am glad you found the lib useful. I had tried myself to work on supporting collections through https://github.com/vincenzobaz/spark-scala3/pull/15 but life got in the way and I have not had a lot of time to dedicate to this project.

> I also added sbt-scalafmt

Thank you for this! It is always useful to have formatting.

> or give me your thoughts and please review

It is the least I could do for such a great contribution!

michael72 commented 1 year ago

Hey @vincenzobaz,

Sooo.... are we gonna merge this? :-)

I could maybe try to work on Enums or on udf (or replacement of udf-call) next (in time). That could be tricky but it could probably be done.

As discussed, if you like, you could add me as a maintainer

michael72 commented 1 year ago

fixes #14

vincenzobaz commented 1 year ago

@michael72 I let you do the honor of merging!

> I could maybe try to work on Enums or on udf (or replacement of udf-call) next (in time). That could be tricky but it could probably be done.

I would say UDFs have higher priority, but I will let you focus on whatever is more fun for you!

michael72 commented 1 year ago

> I let you do the honor of merging!

@vincenzobaz that's nice - unfortunately I do not have the rights to do so - I can only close the PR but not merge it

vincenzobaz commented 1 year ago

> > I let you do the honor of merging!
>
> @vincenzobaz that's nice - unfortunately I do not have the rights to do so - I can only close the PR but not merge it

I am sorry, I thought the approval was enough. I will look into how to add a contributor to the repo :smile:. I am not that familiar with this part of GitHub.

vincenzobaz / spark-scala3

Support sequences + most types used in spark #26