PPPLDeepLearning / plasma-python

PPPL deep learning disruption prediction package
http://tigress-web.princeton.edu/~alexeys/docs-web/html/

Support for float16 training #15

Closed ASvyatkovskiy closed 7 years ago

ASvyatkovskiy commented 7 years ago

This PR adds the ability to train FRNN with half-precision (float16) floats. Main changes (a combined runnable sketch follows the list):

  1. Introduce a custom MPI datatype of 2 contiguous bytes and register it in mpi4py's type dictionary:

    mpi_float16 = MPI.BYTE.Create_contiguous(2).Commit()
    MPI._typedict['e'] = mpi_float16
  2. Define a custom reduction operation for the new type:

    def sum_f16_cb(buffer_a, buffer_b, t):
        assert t == mpi_float16
        array_a = np.frombuffer(buffer_a, dtype='float16')
        array_b = np.frombuffer(buffer_b, dtype='float16')
        array_b += array_a
  3. Register it as an MPI Op:

    mpi_sum_f16 = MPI.Op.Create(sum_f16_cb, commute=True)
  4. Introduce a switch in the code based on the Keras floatx setting:

    if K.floatx() == 'float16':
        self.comm.Allreduce(arr, arr_global, op=mpi_sum_f16)
    else:
        self.comm.Allreduce(arr, arr_global, op=MPI.SUM)

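To show how the four pieces fit together, here is a minimal self-contained sketch (an illustration, not the PR's actual code): it assumes only mpi4py and numpy, replaces `K.floatx()` with a plain variable so Keras is not required, and can be run with e.g. `mpirun -np 4 python sketch.py`. The custom op is needed because MPI has no predefined 16-bit floating-point type, so `MPI.SUM` cannot be applied to the 2-byte datatype directly.

    import numpy as np
    from mpi4py import MPI

    # 1. float16 is not a predefined MPI datatype, so describe it as two
    #    contiguous bytes and register it under numpy's 'e' typecode so
    #    mpi4py can infer it from float16 arrays.
    mpi_float16 = MPI.BYTE.Create_contiguous(2).Commit()
    MPI._typedict['e'] = mpi_float16

    # 2. Custom elementwise sum: MPI sees opaque byte pairs, so reinterpret
    #    both buffers as float16 arrays and accumulate into the second one.
    def sum_f16_cb(buffer_a, buffer_b, t):
        assert t == mpi_float16
        array_a = np.frombuffer(buffer_a, dtype='float16')
        array_b = np.frombuffer(buffer_b, dtype='float16')
        array_b += array_a

    # 3. Register the callback as a commutative reduction operation.
    mpi_sum_f16 = MPI.Op.Create(sum_f16_cb, commute=True)

    # 4. Switch on the working precision, as in the PR.
    floatx = 'float16'  # stand-in for K.floatx()
    comm = MPI.COMM_WORLD

    arr = np.ones(8, dtype=floatx) * comm.Get_rank()
    arr_global = np.empty_like(arr)

    if floatx == 'float16':
        comm.Allreduce(arr, arr_global, op=mpi_sum_f16)
    else:
        comm.Allreduce(arr, arr_global, op=MPI.SUM)

    if comm.Get_rank() == 0:
        # Each element should equal 0 + 1 + ... + (size - 1).
        print(arr_global)
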
Other side changes:

  1. Move configuration handling into plasma/conf_parser.py; general cleanup.