... this causes trouble for users of the Dataset API: datasets that try to obtain their license text from the CC site can no longer be materialized.
Two remedies:
Mark files that do not belong to the dataset itself and come from third-party sources as optional, so that materialization does not fail if they cannot be obtained (or if they fail validation).
Keep copies of common license files in the Dataset JAR, so we can refer to them locally and do not have to rely on e.g. the CC site to provide them.
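
To make the two remedies concrete, here is a minimal sketch of how they could combine. It does not show the actual Dataset API: the `Artifact` class, its `optional` and `bundledResource` fields, and the `materialize` and `open` methods are all hypothetical names invented for illustration, and the classpath location of bundled licenses is an assumption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

class OptionalArtifactSketch {

    /** Hypothetical descriptor of one file a dataset wants to materialize. */
    static final class Artifact {
        final String name;            // file name inside the materialized dataset
        final String remoteUrl;       // where the file is normally fetched from
        final String bundledResource; // classpath copy in the JAR, or null if none
        final boolean optional;       // true for files that come from third-party sources

        Artifact(String name, String remoteUrl, String bundledResource, boolean optional) {
            this.name = name;
            this.remoteUrl = remoteUrl;
            this.bundledResource = bundledResource;
            this.optional = optional;
        }
    }

    /**
     * Copies every artifact into targetDir. A failing optional artifact is
     * recorded and skipped; a failing required artifact aborts materialization.
     * Returns the names of the optional artifacts that were skipped.
     */
    static List<String> materialize(List<Artifact> artifacts, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);
        List<String> skipped = new ArrayList<>();
        for (Artifact a : artifacts) {
            try (InputStream in = open(a)) {
                Files.copy(in, targetDir.resolve(a.name), StandardCopyOption.REPLACE_EXISTING);
            } catch (IOException e) {
                if (!a.optional) {
                    throw e; // required files must still fail loudly
                }
                skipped.add(a.name);
            }
        }
        return skipped;
    }

    /** Prefers a copy bundled in the JAR; falls back to the remote source. */
    static InputStream open(Artifact a) throws IOException {
        if (a.bundledResource != null) {
            InputStream in = OptionalArtifactSketch.class.getResourceAsStream(a.bundledResource);
            if (in != null) {
                return in;
            }
        }
        return URI.create(a.remoteUrl).toURL().openStream();
    }
}
```

In this sketch a license entry would be declared with `optional = true` and, where a local copy exists, a `bundledResource` path, so an outage of the CC site would at most leave the license file out rather than break the whole dataset. Validation failures (for example a checksum mismatch) could be routed through the same `optional` check.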