The current TableBuilder is nice, but it can only create Tables from text files or from binary Pangool Tuple files; it cannot be used with arbitrary InputFormats. Since we are currently working on making Splout integrate easily with the Hadoop ecosystem (Hive, Pig, Cascading), we should make TableBuilder more flexible and also support implicit Schemas, so that if no Schema is provided, it is read from the Tuples of the InputFormat.
The problem is that many methods rely on a previously provided Schema to check things such as whether the partitionBy fields are coherent. So the question is whether we should sample a Path to obtain the Schema first and perform the same checks, or simply skip the checks.
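To make the first option concrete, here is a minimal sketch of "sample first, then run the same checks". It is plain Java and does not use the real Splout or Pangool API: `ImplicitSchemaSketch`, `resolveFields`, `checkPartitionBy` and the `Supplier` that stands in for reading the first Tuple of an InputFormat are all hypothetical names, and a Schema is reduced to a list of field names for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch: none of these names belong to the Splout or Pangool API.
class ImplicitSchemaSketch {

  /** Returns the explicit field list if one was given, otherwise samples the input to infer it. */
  public static List<String> resolveFields(List<String> explicitFields,
                                           Supplier<List<String>> sampler) {
    if (explicitFields != null) {
      return explicitFields;
    }
    // Assumption: the sampler reads one Tuple from the input Path and returns its field names.
    List<String> sampled = sampler.get();
    if (sampled == null || sampled.isEmpty()) {
      throw new IllegalStateException("Could not infer a Schema: the sampled input had no Tuples");
    }
    return sampled;
  }

  /** The same check in both cases: every partitionBy field must exist in the Schema. */
  public static void checkPartitionBy(List<String> schemaFields, String... partitionByFields) {
    for (String field : partitionByFields) {
      if (!schemaFields.contains(field)) {
        throw new IllegalArgumentException(
            "partitionBy field '" + field + "' is not in Schema " + schemaFields);
      }
    }
  }

  public static void main(String[] args) {
    // Stand-in for sampling the first Tuple of an InputFormat.
    Supplier<List<String>> sampler = () -> Arrays.asList("id", "country", "amount");
    List<String> fields = resolveFields(null, sampler);
    checkPartitionBy(fields, "country");
    System.out.println("Resolved fields: " + fields);
  }
}
```

The trade-off in this sketch is that the sampling step runs at build time and needs access to the input, whereas skipping the checks would defer any partitionBy mismatch to the job itself.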