Closed by ruebot 4 years ago
Merging #463 into master will decrease coverage by 0.05%. The diff coverage is 94.54%.
@@            Coverage Diff             @@
##           master     #463      +/-   ##
==========================================
- Coverage   76.49%   76.43%   -0.06%
==========================================
  Files          49       50       +1
  Lines        1459     1460       +1
  Branches      279      279
==========================================
  Hits         1116     1116
- Misses        213      214       +1
  Partials      130      130
Yeah, I could clean that notebook up, and toss it in https://github.com/archivesunleashed/notebooks when we're done.
GitHub issue(s): #408, #409, #410
What does this Pull Request do?
Implement Scala Matchbox UDFs in Python.
How should this be tested?
Additional Notes:
I made a number of structural changes to the Scala side. @lintool, please let me know if you take strong issue with anything.
I'm going to punt on the hasX filters for now and loop back around to them. I hit a wall trying to get them to run in PySpark, and part of me is tempted to just say that we should go with the natural PySpark (Python) implementation here. Basically:
or
Instead of
or
Basically, an argument I made in #425.
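To make the trade-off concrete, here is a minimal plain-Python sketch of the general idea. It is not the snippets from this PR and does not use Spark at all. The records, the column name `http_status_code`, and the helpers `has_http_status` and `is_in` are all hypothetical, chosen only for illustration. It contrasts a dedicated hasX-style helper with a generic membership predicate.

```python
# Hypothetical records standing in for rows of a web-archive DataFrame.
records = [
    {"url": "http://example.com/", "http_status_code": "200"},
    {"url": "http://example.com/missing", "http_status_code": "404"},
    {"url": "http://example.org/", "http_status_code": "200"},
]


def has_http_status(record, statuses):
    """Dedicated hasX-style helper (hypothetical): one function per column."""
    return record["http_status_code"] in statuses


def is_in(column, values):
    """Generic predicate builder (hypothetical): works for any column."""
    allowed = set(values)
    return lambda record: record[column] in allowed


# Dedicated helper: a new function is needed for every column of interest.
dedicated = [r for r in records if has_http_status(r, {"200"})]

# Generic predicate: the language's own filter plus a membership test.
generic = list(filter(is_in("http_status_code", ["200"]), records))
```

Both forms select the same records. In PySpark, the generic form roughly corresponds to `DataFrame.filter` with `Column.isin`, which is the kind of built-in a "natural" implementation would lean on instead of a wrapper per filter.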