Selective data simulation

Currently, we have a single configuration option, "simulation_prob", that controls what percentage of the source speech data simulation is applied to. We need finer control over which data is selected for simulation.

Specifically, we need to estimate the quality of each speech signal and apply simulation only to the high-quality ones. As a first step, we can use SNR estimation: the SNR can be estimated offline and stored as one type of metadata. The simulation module should then use this information to decide whether to apply simulation.
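To make the proposal concrete, here is a rough sketch of what the two pieces could look like. It is not existing pykaldi2 code. The function names, the "snr_db" metadata key, the "snr_threshold_db" option, and the crude energy-percentile SNR estimator are all assumptions made for illustration; a proper SNR estimator could be swapped in without changing the selection logic.

```python
import random

import numpy as np


def estimate_snr_db(wav, frame_len=400, noise_percentile=10.0):
    """Crude offline SNR estimate in dB (illustrative assumption only).

    The quietest frames are treated as noise-only, and the remaining
    average power is attributed to speech.
    """
    wav = np.asarray(wav, dtype=np.float64)
    n_frames = len(wav) // frame_len
    if n_frames < 2:
        raise ValueError("signal is too short to estimate SNR")
    frames = wav[: n_frames * frame_len].reshape(n_frames, frame_len)
    energy = np.mean(frames ** 2, axis=1)

    # Frames at or below the chosen energy percentile form the noise estimate.
    floor = np.percentile(energy, noise_percentile)
    noise_power = np.mean(energy[energy <= floor]) + 1e-12

    # Average power minus the noise estimate approximates the speech power.
    signal_power = max(np.mean(energy) - noise_power, 1e-12)
    return 10.0 * np.log10(signal_power / noise_power)


def should_simulate(meta, snr_threshold_db, simulation_prob, rng=random):
    """Decide whether to apply simulation to one utterance.

    `meta` is a hypothetical per-utterance metadata dict holding the SNR
    that was estimated offline under the key "snr_db".
    """
    snr_db = meta.get("snr_db")
    if snr_db is None:
        # Assumption: utterances without an SNR estimate are not simulated.
        return False
    if snr_db < snr_threshold_db:
        # Low-quality source speech is left as it is.
        return False
    # Among high-quality utterances, keep the existing random selection.
    return rng.random() < simulation_prob
```

With this split, "simulation_prob" keeps its current meaning for utterances that pass the SNR check, and the threshold only filters out low-quality source speech before the random draw.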

jzlianglu / pykaldi2

Selective data simulation #4