PipelineDP is a Python framework for applying differentially private aggregations to large datasets using batch processing systems such as Apache Spark, Apache Beam, and more.
In case if the output is not empty the schema can be deduced automatically, but if the output is empty (e.g. because all partitions are dropped by partition selection) there was an example "Empty RDD". Setting the schema fixes that.
In case if the output is not empty the schema can be deduced automatically, but if the output is empty (e.g. because all partitions are dropped by partition selection) there was an example "Empty RDD". Setting the schema fixes that.