FINRAOS / MegaSparkDiff

A Spark-based tool for comparing data at scale. It lets software engineers compare data from many pair combinations of possible data sources. Multiple execution modes in multiple environments let the user generate a diff report as a Java/Scala-friendly DataFrame or as a file for future use. It comes with SparkFactory and SparkCompare tools out of the box.
https://finraos.github.io/MegaSparkDiff/
Apache License 2.0
49 stars, 26 forks

Develop #66

Closed (patshea closed 2 years ago)

patshea commented 4 years ago

Added parallelizeCSVSource to load CSV data into DataFrames for comparison. Fixed typo in CountResult. Added enum for CSV source type.

blackduck-copilot[bot] commented 4 years ago

Black Duck Security Report

Merging #66 into develop will decrease security risk!

Added Components

High Risk: 1, Clean: 218

Removed Components

High Risk: 6, Medium Risk: 17, Low Risk: 2, Clean: 237
