Closed MinaKh closed 5 months ago
Attention: 22 lines in your changes are missing coverage. Please review.
Comparison is base (db13037) 76.80% compared to head (c9d26b0) 77.34%. Report is 12 commits behind head on develop.
Files | Patch % | Lines |
---|---|---|
tonic/cached_dataset.py | 58.69% | 19 Missing :warning: |
tonic/audio_transforms.py | 92.00% | 2 Missing :warning: |
tonic/audio_augmentations.py | 98.78% | 1 Missing :warning: |
Hello @MinaKh! I don't yet understand why what you're trying to achieve with this class cannot already be done with the existing classes. It seems to me that you want to control the augmentations exactly, but then I don't understand why they're called augmentations. Can you please:
- provide an example of how you use your proposed class
- explain with a concrete example why the current code cannot do what you need to do
Before I can merge this, the class also needs a test, so it would be helpful to add one.
Hi @biphasic! Thanks for your feedback. I have added `docs/tutorails/Aug_DiskCachDataset.ipynb` and have addressed the points you raised there with a synthetic dataset. Please let me know if anything is unclear.
This branch adds a child class of `DiskCachedDataset` called `AugDiskCachedDataset`. Its main use is for a family of so-called deterministic augmentations with a discrete parameter space, for instance a noise augmentation on audio samples in which the SNR can take only 5 values.

In `DiskCachedDataset`, `num_copies` can be used to generate N copies of a data sample. This is fine when the transforms/augmentations used have an infinite or probabilistic parameter space, because the chance of generating repeated augmented versions is then very low.
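To make the concern concrete, here is a minimal sketch in plain Python. It does not use tonic's API. The SNR values and all function names below are hypothetical and chosen only for illustration. It contrasts drawing the parameter independently for each copy with assigning each copy a fixed parameter value.

```python
import random

# Hypothetical discrete parameter space: the 5 allowed SNR values (in dB).
SNR_VALUES_DB = [0, 5, 10, 15, 20]


def random_copies(num_copies, rng):
    """Each copy draws its SNR independently, so repeats are possible."""
    return [rng.choice(SNR_VALUES_DB) for _ in range(num_copies)]


def deterministic_copies(num_copies):
    """Copy i gets the i-th SNR value, cycling if there are more copies than values."""
    return [SNR_VALUES_DB[i % len(SNR_VALUES_DB)] for i in range(num_copies)]


def prob_all_distinct(num_values, num_copies):
    """Exact probability that independent uniform draws are all different."""
    if num_copies > num_values:
        return 0.0
    p = 1.0
    for i in range(num_copies):
        p *= (num_values - i) / num_values
    return p


if __name__ == "__main__":
    rng = random.Random(0)
    print("random draws:       ", random_copies(5, rng))
    print("deterministic order:", deterministic_copies(5))
    print("P(5 random draws are all distinct):", prob_all_distinct(5, 5))
```

With 5 possible values and 5 copies, independent draws cover every value without repeats only about 3.8% of the time (5!/5^5), whereas the deterministic assignment covers each value exactly once.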