jzlianglu / pykaldi2

Yet another speech toolkit based on Kaldi and PyTorch
MIT License

RIR format #2

Open singaxiong opened 5 years ago

singaxiong commented 5 years ago

Currently, the code assumes a very specific metadata format for the RIRs. We need to define a standard format that is flexible and easy to use.

By flexible, I mean it should support a variable amount of metadata per RIR. For example, some RIRs come with information about room size, source-to-sensor distance, azimuth angle, reverberation time, etc., while others have no metadata at all. We need to support both cases.
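
To make the "variable amount of metadata" idea concrete, here is a rough sketch of one possible record type. It is not what pykaldi2 currently does; the class name `RirMeta`, the field names, the units, and the choice of JSON are all assumptions made for illustration. Every field is optional, and keys the record does not know about are kept in `extra` instead of being rejected.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class RirMeta:
    """Hypothetical per-RIR metadata record; every field is optional."""

    room_size_m: Optional[List[float]] = None   # assumed [length, width, height]
    source_distance_m: Optional[float] = None   # source-to-sensor distance
    azimuth_deg: Optional[float] = None
    t60_s: Optional[float] = None               # reverberation time
    extra: Dict[str, Any] = field(default_factory=dict)

    def to_json(self) -> str:
        # Drop unset fields so a RIR without metadata serializes to "{}".
        data = {k: v for k, v in asdict(self).items() if v not in (None, {})}
        return json.dumps(data, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "RirMeta":
        raw = json.loads(text)
        extra = raw.pop("extra", {})
        names = [n for n in cls.__dataclass_fields__ if n != "extra"]
        known = {n: raw.pop(n) for n in names if n in raw}
        extra.update(raw)  # keep unrecognized keys instead of failing
        return cls(**known, extra=extra)


# A fully described RIR and one with no metadata share the same type.
full = RirMeta(room_size_m=[5.0, 4.0, 3.0], source_distance_m=1.5, t60_s=0.4)
bare = RirMeta()
```

With this layout, code that needs a particular field (say `t60_s`) has to check for `None` first, which is the price of supporting RIRs that carry no metadata.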

We also need to define how the RIR waveforms are stored. One option is to store multiple multi-channel RIRs, from different source positions to the same sensor position(s), in one file, so that some of them can be used for different speech sources and others for directional noise sources.
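
A minimal sketch of that storage option, under several assumptions: the name `RirSet`, the `rirs[source][channel][sample]` indexing, and the idea of drawing disjoint source positions for speech and noise at simulation time are illustrative, not part of the existing code. Plain Python lists stand in for whatever array type and on-disk format is eventually chosen.

```python
import random
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class RirSet:
    """Hypothetical container: all RIRs measured at one sensor position."""

    sample_rate: int
    # rirs[s][c] is the impulse response from source position s to channel c.
    rirs: List[List[List[float]]] = field(default_factory=list)

    def add_source(self, rir: List[List[float]]) -> int:
        # Every source position must cover the same set of sensor channels.
        if self.rirs and len(rir) != len(self.rirs[0]):
            raise ValueError("channel count differs from existing RIRs")
        self.rirs.append(rir)
        return len(self.rirs) - 1

    def split(self, n_speech: int, n_noise: int,
              rng: random.Random) -> Tuple[List[int], List[int]]:
        # Draw disjoint source positions: some for speech, some for noise.
        if n_speech + n_noise > len(self.rirs):
            raise ValueError("not enough source positions in this set")
        picked = rng.sample(range(len(self.rirs)), n_speech + n_noise)
        return picked[:n_speech], picked[n_speech:]


# Toy set: 4 source positions, 2 channels, 3 samples per impulse response.
rir_set = RirSet(sample_rate=16000)
for s in range(4):
    rir_set.add_source([[1.0, 0.5 * s, 0.0], [0.9, 0.4 * s, 0.0]])

speech_ids, noise_ids = rir_set.split(n_speech=1, n_noise=2,
                                      rng=random.Random(0))
```

Because all RIRs in one set share the sensor position, any speech and noise sources drawn from the same set are spatially consistent with each other when mixed.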