awslabs/deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Apache License 2.0
3.32k stars
539 forks
issues
Replace Spark SQL isNull check with Spark Scala based DSL
#493
rdsharma26
closed
1 year ago
0
Updated the Categorical range constraint suggestions to use a new class called ConstraintSuggestionWithValue
#492
rdsharma26
closed
1 year ago
0
Added Uniqueness constraint suggestion to the list of EXTENDED suggestions
#491
rdsharma26
closed
1 year ago
0
Enhanced constraint suggestions
#490
rdsharma26
closed
1 year ago
0
Addition of HasMax/HasMin/HasStandardDeviation/HasMean constraint suggestions
#489
rdsharma26
closed
1 year ago
0
Adding the custom constraints
#488
DivyangPatelIITD
opened
1 year ago
1
Incremental profiling to be merged with older result
#487
nihal-laliwala-a
opened
1 year ago
0
org.apache.spark.SparkException: Task not serializable
#486
sudhakaru
closed
4 months ago
4
Improve message wording for dataset comparison
#485
mentekid
closed
1 year ago
0
Issue 462 Fix
#484
samarth-c1
closed
1 year ago
1
What are the current project prerequisites for building and installing Deequ locally?
#483
jasonhorner
opened
1 year ago
0
Fix chi-square test conditions
#482
bevhanno
closed
1 year ago
1
Update Deequ to Spark 3.4
#481
jklap
closed
1 year ago
2
Add population stability index (PSI) to distance methods
#480
bevhanno
closed
1 year ago
1
issue 462 fix - updated the escape character replacement from ' to \
#479
samarth-c1
closed
1 year ago
1
Missing Column Precondition for Compliance Check - issue fix 467
#478
samarth-c1
closed
1 year ago
4
Filter or quarantine data based on row-level checks
#477
riccardodelega
closed
1 year ago
7
Added ColumnValues to Row-level Results
#476
zixianzh1
closed
1 year ago
1
Alternative aggregate functions to calculate histogram values.
#475
akalotkin
closed
1 year ago
5
Fix typo Mutlicolumn -> Multicolumn
#474
eycho-am
closed
1 year ago
0
[Experimental] Added a function to the Data Synchronization utility for annotating the dataframe with row level results
#473
rdsharma26
closed
1 year ago
0
[BUG] Serializing FullColumn field in Metric fails for Uniqueness
#472
eycho-am
opened
1 year ago
0
Feature/uniqueness row level results
#471
eycho-am
closed
1 year ago
2
Bump spark-core_2.12 from 3.3.0 to 3.4.0
#469
dependabot[bot]
closed
9 months ago
2
Feature: ColumnValues to Row-level Results
#468
zixianzh1
closed
1 year ago
0
Execution failure crosstalk between different checks in a suite
#467
marcantony
closed
1 year ago
2
New API added to referential integrity to allow for row level annotation
#466
rdsharma26
closed
1 year ago
2
Feature: Length Row Level Results
#465
eycho-am
closed
1 year ago
0
Update issue templates
#464
eycho-am
closed
1 year ago
0
Updated Referential Integrity to support multiple columns
#463
rdsharma26
closed
1 year ago
0
Check isContainedIn does not recognize string in quotes as allowed value
#462
markushc
closed
1 year ago
2
Bugfix/escape columnname
#461
eycho-am
closed
1 year ago
2
Bugfix: Escape Column Name with backticks to allow names with "."
#460
eycho-am
closed
1 year ago
1
Bugfix: Small perf optimization for constraints using histogram
#459
mentekid
closed
1 year ago
0
[Bugfix] Improve histogram performance
#458
mentekid
closed
1 year ago
0
Update pom.xml to publish to https://aws.oss.sonatype.org/
#457
eycho-am
closed
1 year ago
0
DQDL example for Time-series
#456
jinman
closed
1 year ago
1
Update version to 2.0.3-spark-3.3
#455
eycho-am
closed
1 year ago
0
Update version to 2.0.3-spark-3.3
#454
eycho-am
closed
1 year ago
0
Fix style issues causing mvn install to fail.
#453
rdsharma26
closed
1 year ago
0
Feature: Row Level Results
#452
mentekid
closed
1 year ago
0
Feature: Row Level Results
#451
mentekid
closed
1 year ago
2
adding to the committer list
#450
haojiliu
opened
1 year ago
7
[Experimental] Addition of dataset comparison utilities
#449
rdsharma26
closed
1 year ago
0
Empty commit for testing build action
#448
rdsharma26
closed
1 year ago
0
Add Github build action
#447
rdsharma26
closed
1 year ago
0
Adding custom RestMetricsRepository to release 2.0.0-spark-3.1
#446
cmachgodaddy
closed
1 year ago
0
Adding new MetricRepository to read and write data via REST
#445
cmachgodaddy
opened
1 year ago
4
Adding chi-square distance method for categorical variables
#444
bevhanno
closed
1 year ago
4
test
#443
donglin1234
closed
1 year ago
1