Closed henrydavidge closed 3 months ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 92.36%. Comparing base (4f9d314) to head (6a2ef4f).
…compact
What changes are proposed in this pull request?
As reported in #537, the `split_multiallelics` transformer splits INFO fields in an unexpected way for unbounded INFO fields. After this PR, we: … A (alternate alleles)

In addition, I replaced the looped calls to `withColumn` with batched calls to `withColumns`. Calling `withColumn` many times is not recommended, as it can result in very large plans: https://spark.apache.org/docs/3.1.3/api/python/reference/api/pyspark.sql.DataFrame.withColumn.html

How is this patch tested?
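For reviewers unfamiliar with the `withColumn` to `withColumns` change mentioned in the description, here is a minimal Python sketch of the pattern. It is not the code from this patch: the helper names `add_columns_looped`, `add_columns_batched`, and `demo` are hypothetical, and it assumes PySpark 3.3 or later, where `DataFrame.withColumns` accepts a dict of column name to expression.

```python
from typing import Any, Callable, Iterable


def add_columns_looped(df: Any, names: Iterable[str], make_expr: Callable[[str], Any]) -> Any:
    """Hypothetical helper: one withColumn call per column.

    Each call returns a new DataFrame, so many calls can produce a very large plan.
    """
    for name in names:
        df = df.withColumn(name, make_expr(name))
    return df


def add_columns_batched(df: Any, names: Iterable[str], make_expr: Callable[[str], Any]) -> Any:
    """Hypothetical helper: a single withColumns call that adds all columns at once."""
    return df.withColumns({name: make_expr(name) for name in names})


def demo() -> None:
    """Runs the batched variant on a tiny DataFrame (requires a local Spark install)."""
    # pyspark is imported here so the helpers above stay importable without Spark.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    df = spark.range(3)
    names = [f"id_plus_{i}" for i in range(1, 4)]

    def make_expr(name: str) -> Any:
        # Column "id_plus_N" holds id + N.
        return F.col("id") + int(name.rsplit("_", 1)[1])

    add_columns_batched(df, names, make_expr).show()
    spark.stop()
```

Calling `demo()` in an environment with Spark available shows the batched variant end to end. Both helpers should yield the same columns; the difference is only in how many DataFrame transformations are issued.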
(Describe any other testing)
To run Spark 4.0 tests, add `[SPARK4]` to the pull request title.