Limit max memory for ExternalSorter and BufferedExternalSorter to 2047 MB to prevent int overflow within Hadoop's sorting library
Fix int overflow for large memory values in InMemorySorter
Add note about estimated disk use to README.MD
Make Hadoop's sorting library put all temp files under the specified directory
Have Hadoop clean up the temp directory on exit
Stop shading Hadoop dependencies. Some context:
The existing shading is broken (modules that depend on this one cannot use it successfully).
Hadoop uses reflection in several places, which makes shading the dependency "in a good way" nearly impossible. Doing so requires a couple of rather brittle hacks, and for clients that depend on certain conflicting versions of Hadoop, those hacks can keep the shading from meeting its intended goal of preventing conflicts anyway.
From what I can tell, there's no good way to shade this to make it universally usable, so leaving it unshaded seems like a reasonable default.
Without shading Hadoop, this module can be used successfully from Beam's wordcount example, which already has Hadoop dependencies of its own.
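
For reviewers wondering why the memory limit listed above is 2047 MB: a megabyte count converted to bytes with 32-bit `int` arithmetic wraps to a negative value at 2048 MB. The sketch below illustrates only that arithmetic. The class and method names are hypothetical and are not the code in this change or in Hadoop's sorting library.

```java
// Hypothetical illustration of the 2047 MB ceiling; not the actual sorter code.
final class SorterMemoryLimit {
    // Largest whole-megabyte value whose byte count still fits in a signed 32-bit int.
    static final int MAX_MEMORY_MB = 2047;

    // Rejects values whose byte count would not fit in an int, then converts.
    static int toBytesChecked(int memoryMB) {
        if (memoryMB <= 0 || memoryMB > MAX_MEMORY_MB) {
            throw new IllegalArgumentException(
                "memoryMB must be in (0, " + MAX_MEMORY_MB + "], got " + memoryMB);
        }
        return memoryMB * 1024 * 1024;
    }

    // Same conversion with no guard: int multiplication silently wraps around.
    static int toBytesUnchecked(int memoryMB) {
        return memoryMB * 1024 * 1024;
    }

    // Widening to long before multiplying avoids the wraparound entirely.
    static long toBytesLong(int memoryMB) {
        return memoryMB * 1024L * 1024L;
    }

    public static void main(String[] args) {
        System.out.println(toBytesChecked(2047));   // 2146435072
        System.out.println(toBytesUnchecked(2048)); // -2147483648
        System.out.println(toBytesLong(4096));      // 4294967296
    }
}
```

Widening to `long` is enough when the code doing the arithmetic is under your control. A hard cap is the simpler option when the value is handed to a library that does the `int` arithmetic internally.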
Includes: