ivrit-ai / ivrit.ai

ivrit.ai codebase
MIT License

Train and Evaluate Models on Noise-Filtered Data #37

Open yairl opened 2 months ago

yairl commented 2 months ago
  1. Train a model on training data that has been noise-filtered/reduced (using a denoiser or other tools/models)
  2. Apply noise filtering/reduction to the test data, then run inference on it with the new model
  3. Evaluate the impact on transcription accuracy
yanirmr commented 1 month ago

Is your feature request related to a problem? Please describe.
The current training and evaluation processes do not account for noise in the data, which can degrade transcription accuracy. Training and testing on noise-filtered data may improve the model's performance in real-world noisy environments.

Describe the solution you'd like
Train a model using noise-filtered/reduced training data by applying denoising techniques or tools. Additionally, apply noise filtering/reduction to the test data before running inference with the new model. Finally, evaluate the impact of noise filtering on transcription accuracy.
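For the evaluation step, one way to quantify the impact is to compare corpus-level word error rate (WER) between a baseline run and a noise-filtered run. The sketch below is only an illustration under stated assumptions: the function names are hypothetical and not part of this codebase, tokenization is plain whitespace splitting with no text normalization, and the transcripts are invented placeholders rather than real results.

```python
from typing import Sequence, Tuple


def word_errors(reference: str, hypothesis: str) -> Tuple[int, int]:
    """Return (word-level edit distance, number of reference words)."""
    ref, hyp = reference.split(), hypothesis.split()
    # Row-by-row Levenshtein distance computed over words instead of characters.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = curr
    return prev[-1], len(ref)


def corpus_wer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Total word errors divided by total reference words over a test set."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    errors = words = 0
    for ref, hyp in zip(references, hypotheses):
        e, n = word_errors(ref, hyp)
        errors += e
        words += n
    if words == 0:
        raise ValueError("reference transcripts contain no words")
    return errors / words


# Invented placeholder transcripts, for illustration only.
references = ["the cat sat on the mat"]
baseline_output = ["the cat sat on mat"]      # hypothetical: raw audio, current model
filtered_output = ["the cat sat on the mat"]  # hypothetical: denoised audio, new model

baseline_wer = corpus_wer(references, baseline_output)
filtered_wer = corpus_wer(references, filtered_output)
print(f"baseline WER: {baseline_wer:.3f}, noise-filtered WER: {filtered_wer:.3f}")
```

Reporting both numbers on the same test set, rather than only the noise-filtered one, makes it possible to tell whether the denoising step helps or hurts.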

Describe alternatives you've considered
An alternative is to augment the training data with synthetic noise so that the model becomes more robust to noisy environments.
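As a rough illustration of the augmentation alternative, noise can be mixed into clean speech at a chosen signal-to-noise ratio (SNR). The sketch below uses only the standard library and plain lists of samples. `mix_at_snr` is a hypothetical helper, not an existing function in this repository, and the sine wave and Gaussian noise are stand-ins for real recordings.

```python
import math
import random
from typing import List, Sequence


def mean_power(samples: Sequence[float]) -> float:
    """Average squared amplitude of a signal."""
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(speech: Sequence[float], noise: Sequence[float], snr_db: float) -> List[float]:
    """Add noise to speech so that the result has the requested SNR in dB."""
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")
    # Loop the noise clip so it covers the whole utterance.
    tiled = [noise[i % len(noise)] for i in range(len(speech))]
    p_speech = mean_power(speech)
    p_noise = mean_power(tiled)
    if p_noise == 0:
        raise ValueError("noise clip is silent")
    # Pick the gain g so that p_speech / (g^2 * p_noise) == 10^(snr_db / 10).
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, tiled)]


# Stand-in signals: a 440 Hz tone at 16 kHz and seeded Gaussian noise.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]

noisy = mix_at_snr(speech, noise, snr_db=10.0)
```

Sampling `snr_db` from a range per utterance, instead of fixing one value, is a common way to expose the model to varied noise levels, though the right range for this dataset would need to be found experimentally.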

Additional context
Implementing this feature will require integration of noise filtering tools or models into the training and testing pipelines. This enhancement aims to improve transcription accuracy, especially in noisy conditions, by evaluating the effectiveness of noise reduction techniques.
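One possible shape for that integration is a pluggable preprocessing hook applied identically to the training and test splits, so both see the same transform. Everything in the sketch below is an assumption made for illustration: `Example`, `preprocess_split`, and `moving_average` are hypothetical names, and the moving-average filter is only a placeholder for a real denoiser, not a recommended noise reduction method.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# A denoiser is any callable that maps audio samples to audio samples.
Denoiser = Callable[[Sequence[float]], List[float]]


@dataclass(frozen=True)
class Example:
    audio: Sequence[float]  # raw waveform samples
    text: str               # reference transcript


def moving_average(samples: Sequence[float], window: int = 3) -> List[float]:
    """Placeholder 'denoiser': smooths the signal with a centered moving average."""
    half = window // 2
    out = []
    for i in range(len(samples)):
        lo, hi = max(0, i - half), min(len(samples), i + half + 1)
        out.append(sum(samples[lo:hi]) / (hi - lo))
    return out


def preprocess_split(examples: Sequence[Example], denoise: Denoiser) -> List[Example]:
    """Apply the denoiser to every example's audio; transcripts are left untouched."""
    return [Example(audio=denoise(ex.audio), text=ex.text) for ex in examples]


# Tiny invented splits, for illustration only.
train = [Example(audio=[0.0, 3.0, 0.0], text="hello")]
test = [Example(audio=[1.0, 1.0, 4.0, 1.0], text="world")]

# The same hook is used for both splits.
train_filtered = preprocess_split(train, moving_average)
test_filtered = preprocess_split(test, moving_average)
```

Keeping the denoiser behind a single callable means different tools or models can be swapped in and compared without touching the rest of the training or evaluation code.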