A Spark-based data comparison tool that lets software engineers compare data at scale across many pairings of data sources. It supports multiple execution modes in multiple environments, and the diff report can be produced either as a Java/Scala-friendly DataFrame or as a file for later use. It ships with the SparkFactory and SparkCompare tools out of the box.
Modified SparkCompare.CompareSchemaDataFrames so that it no longer needs or uses tempViewName, which makes it more convenient to use in Databricks notebooks.
Added a FullOuterJoin method to SparkCompare to make it easier to visualize differences when working in an IDE or console.
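
To illustrate why a full outer join is a convenient way to eyeball differences, here is a minimal sketch in plain Java. It does not use Spark or the library's own API: the class `FullOuterJoinSketch`, its `Row` type, and the `fullOuterJoin` helper are hypothetical names invented for this example, and each data source is assumed to be a simple map from a key to a non-null value. The point is only the shape of the result: every key from either side appears once, with a null on the side where it is missing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeSet;

// Hypothetical illustration only; not part of SparkCompare.
public class FullOuterJoinSketch {

    // One joined row: the key plus the value from each side.
    // A side is null when that source has no row for the key.
    public static final class Row {
        public final String key;
        public final String left;
        public final String right;

        Row(String key, String left, String right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }

        // True when the two sides disagree, including when one side is missing.
        public boolean differs() {
            return !Objects.equals(left, right);
        }
    }

    // Full outer join of two keyed sources (values assumed non-null).
    // Keys are visited in sorted order so the output is deterministic.
    public static List<Row> fullOuterJoin(Map<String, String> left, Map<String, String> right) {
        TreeSet<String> keys = new TreeSet<>(left.keySet());
        keys.addAll(right.keySet());

        List<Row> rows = new ArrayList<>();
        for (String key : keys) {
            rows.add(new Row(key, left.get(key), right.get(key)));
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, String> left = new HashMap<>();
        left.put("1", "apple");
        left.put("2", "banana");

        Map<String, String> right = new HashMap<>();
        right.put("2", "blueberry");
        right.put("3", "cherry");

        // Print every key side by side and flag the rows that differ.
        for (Row row : fullOuterJoin(left, right)) {
            System.out.println(row.key + " | " + row.left + " | " + row.right
                    + (row.differs() ? "  <-- differs" : ""));
        }
    }
}
```

In Spark itself, the same side-by-side view can be obtained with the standard `Dataset.join(right, joinExprs, "full_outer")` call; the sketch above only mirrors that behaviour on in-memory maps so it can run without a Spark session.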