ia-web-commons is based on a very old Hadoop version (0.20.2), which should be upgraded.
In production we already use a separate branch "hadoop-3.2.2" (I know, branch names should not include a version number), which ia-hadoop-tools depends on.
Maybe it's time to upgrade master as well?
(plus) having a single branch is easier for users and deployment
(minus) this would bring ia-web-commons further apart from iipc/webarchive-commons
Detailed upgrades:
Hadoop 0.20.2-cdh3u4 -> 3.3.5
depend on the lean hadoop-client instead of hadoop-core to avoid dependency exclusions
use mainline vanilla Hadoop instead of Cloudera libs and remove the Cloudera repository from the build configuration (note: the free "Cloudera Distribution of Hadoop (CDH)" was discontinued in Oct 2019)
dependency upgrades (if a dependency is also required by Hadoop, rely on the version Hadoop uses)
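For illustration, the Hadoop change described above could look roughly like the following pom.xml fragment. This is only a sketch, not the exact diff of this PR: the Maven coordinates are those of the upstream Apache artifact, and the placement inside the existing `<dependencies>` section is an assumption.

```xml
<!-- Sketch only: replaces the former hadoop-core 0.20.2-cdh3u4 dependency. -->
<!-- hadoop-client is the lean client-side aggregator artifact, so the      -->
<!-- exclusions previously needed for hadoop-core should no longer apply.   -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>3.3.5</version>
</dependency>
<!-- The artifact is resolved from Maven Central, so the Cloudera entry in  -->
<!-- the <repositories> section can be dropped.                             -->
```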