MrPowers / bebe

Filling in the Spark function gaps across APIs

More column types #15

Closed alfonsorr closed 3 years ago

alfonsorr commented 3 years ago

Restructured the project into a single-import project: just import mrpowers.bebe._ and everything needed is brought into scope.

Updated SBT to the latest version.

Added String and Boolean columns

Added syntax for common column functionality, such as typed equality.

Added a type class for literal conversion, which allows any compatible Scala type to be passed as a literal to a typed column.

Added numeric logic (plus, minus, greater than, etc.) against another column of the same type and against literals. (There is a compile bug when adding a literal to an IntegerColumn 🤦, but the rest of the operations work.) Example:

    val df = Seq(
      (2, true, 1, 4),
      (3, true, 2, 6),
      (4, false, 3, 8)
    ).toDF("some_data", "expected_result", "expected_minus", "expected_double")
      .withColumn("transformed")(_.get[IntegerColumn]("some_data") <= 3)
      .withColumn("minus")(_.get[IntegerColumn]("some_data") - 1)
      .withColumn("plus_double")(df => df.get[IntegerColumn]("some_data") + df.get[IntegerColumn]("some_data"))
      .withColumn("mult_double")(df => df.get[IntegerColumn]("some_data") * 2)

The next PR will add more types, such as Float and Bigint, as well as a type class for type-safe casting of columns.
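
For readers unfamiliar with the pattern, the literal-conversion type class described above can be sketched roughly as follows. This is a minimal, Spark-free illustration and not the code from this PR: the ToLiteral trait, its instances, and the string-backed column classes are all hypothetical names chosen for the example.

```scala
// Hypothetical sketch: these names are illustrative, not bebe's actual definitions.
object LiteralSketch {
  // Typed columns, modelled here as wrappers around a SQL-like expression string.
  final case class BooleanColumn(expr: String)

  final case class IntegerColumn(expr: String) {
    // Accepts any A for which a ToLiteral[A, IntegerColumn] instance is in scope.
    def <=[A](other: A)(implicit ev: ToLiteral[A, IntegerColumn]): BooleanColumn =
      BooleanColumn(s"(${expr} <= ${ev.lit(other).expr})")

    def -[A](other: A)(implicit ev: ToLiteral[A, IntegerColumn]): IntegerColumn =
      IntegerColumn(s"(${expr} - ${ev.lit(other).expr})")
  }

  // Type class: evidence that a value of type A can be lifted into column type C.
  trait ToLiteral[A, C] {
    def lit(a: A): C
  }

  object ToLiteral {
    // A Scala Int is a valid literal for an IntegerColumn.
    implicit val intToIntegerColumn: ToLiteral[Int, IntegerColumn] =
      new ToLiteral[Int, IntegerColumn] {
        def lit(a: Int): IntegerColumn = IntegerColumn(a.toString)
      }

    // A column of the same type passes through unchanged.
    implicit val integerColumnIdentity: ToLiteral[IntegerColumn, IntegerColumn] =
      new ToLiteral[IntegerColumn, IntegerColumn] {
        def lit(a: IntegerColumn): IntegerColumn = a
      }
  }

  val someData: IntegerColumn = IntegerColumn("some_data")
  val lessOrEqual: BooleanColumn = someData <= 3        // Int literal
  val minusOne: IntegerColumn = someData - 1            // Int literal
  val minusSelf: IntegerColumn = someData - someData    // another column of the same type
}
```

With this shape, an expression such as someData <= "three" does not compile, because no ToLiteral[String, IntegerColumn] instance exists. That is the kind of compile-time check the type class is meant to provide.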

alfonsorr commented 3 years ago

Just found a workaround for the error when adding literals: import Predef.{any2stringadd => _, _}
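
As a rough illustration of where that import goes, here is a small Spark-free sketch. The IntegerColumn and IntegerColumnOps definitions are hypothetical stand-ins, and the exact signature that failed in this PR is not shown in the thread, so this only shows the import sitting alongside a custom + operator; it does not reproduce the original error.

```scala
// Placed at the top of the source file: hides Predef's any2stringadd, which
// otherwise offers a String-concatenating `+` on every value, while keeping
// the rest of Predef available.
import Predef.{any2stringadd => _, _}

// Hypothetical sketch: these names are illustrative, not bebe's actual definitions.
object PlusSketch {
  final case class IntegerColumn(expr: String)

  // Custom `+` supplied through an implicit class (extension-method style).
  implicit class IntegerColumnOps(col: IntegerColumn) {
    def +(n: Int): IntegerColumn = IntegerColumn(s"(${col.expr} + $n)")
  }

  val plusOne: IntegerColumn = IntegerColumn("some_data") + 1
}
```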

MrPowers commented 3 years ago

Great refactoring!!! Thank you!