DataGenerator is a Java library for systematically producing large volumes of data. DataGenerator frames data production as a modeling problem, with a user providing a model of dependencies among variables and the library traversing the model to produce relevant data sets.
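To make the "model of dependencies, then traversal" idea concrete, here is a minimal, self-contained sketch in plain Java. It does not use DataGenerator's actual API: the names `ModelTraversalSketch`, `Variable`, `generate`, and `exampleModel`, and the `side`/`orderType` fields, are all hypothetical and exist only for illustration. The sketch assumes a model can be written as an ordered list of variables, where each variable's allowed values may depend on the values already chosen, and it walks that model depth-first to enumerate every consistent record.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative only: NOT the DataGenerator API, just the general idea.
public class ModelTraversalSketch {

    /** A variable whose allowed values may depend on variables assigned earlier. */
    public static final class Variable {
        final String name;
        final Function<Map<String, String>, List<String>> allowedValues;

        public Variable(String name, Function<Map<String, String>, List<String>> allowedValues) {
            this.name = name;
            this.allowedValues = allowedValues;
        }
    }

    /** Enumerates every record consistent with the model via depth-first traversal. */
    public static List<Map<String, String>> generate(List<Variable> model) {
        List<Map<String, String>> out = new ArrayList<>();
        expand(model, 0, new LinkedHashMap<>(), out);
        return out;
    }

    private static void expand(List<Variable> model, int index,
                               Map<String, String> partial,
                               List<Map<String, String>> out) {
        if (index == model.size()) {
            out.add(new LinkedHashMap<>(partial)); // copy the completed record
            return;
        }
        Variable v = model.get(index);
        // The candidate values are computed from what has been assigned so far.
        for (String value : v.allowedValues.apply(partial)) {
            partial.put(v.name, value);
            expand(model, index + 1, partial, out);
            partial.remove(v.name); // backtrack before trying the next value
        }
    }

    /** Hypothetical two-variable model: orderType depends on side. */
    public static List<Variable> exampleModel() {
        return List.of(
            new Variable("side", r -> List.of("BUY", "SELL")),
            new Variable("orderType", r -> "BUY".equals(r.get("side"))
                    ? List.of("MARKET", "LIMIT")
                    : List.of("LIMIT")));
    }

    public static void main(String[] args) {
        generate(exampleModel()).forEach(System.out::println);
    }
}
```

With this toy model, the traversal yields three records rather than the four an unconstrained cross product would give, because the dependency rules out a `SELL` order of type `MARKET`. A real model would be much larger, which is where distributing the traversal across a cluster becomes relevant.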
Hadoop is outdated, and it's better not to use it nowadays. Let's migrate everything to Spark.