Data augmentation techniques are important for this task. Right now we have:
1) variation through variable names
2) variation through names: first, last, cities, streets, etc.
3) perg
4) choiceg
5) variation through random choice of numerical values
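As a rough illustration of items 1, 2 and 5, here is a minimal sketch of template-based variation. Everything in it is an assumption made for illustration: the name pools, the template string, and the `augment` helper are hypothetical and are not part of the framework (it does not try to show how perg or choiceg work).

```python
import random

# Hypothetical pools; a real setup would load much larger lists.
FIRST_NAMES = ["Alice", "Bob", "Carla", "Deepak"]
CITIES = ["Lisbon", "Osaka", "Denver"]
VARIABLE_NAMES = ["x", "y", "n", "total"]

# Hypothetical sentence template with one slot per kind of variation.
TEMPLATE = "{name} from {city} sets {var} to {value}."


def augment(template, n_examples, seed=0):
    """Fill the template n_examples times with randomly chosen values."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    examples = []
    for _ in range(n_examples):
        examples.append(
            template.format(
                name=rng.choice(FIRST_NAMES),    # variation through names
                city=rng.choice(CITIES),         # variation through cities
                var=rng.choice(VARIABLE_NAMES),  # variation through variable names
                value=rng.randint(0, 999),       # random numerical value
            )
        )
    return examples


if __name__ == "__main__":
    for sentence in augment(TEMPLATE, 3):
        print(sentence)
```

Each slot multiplies the number of distinct sentences one template can yield, so even small pools give many surface forms for a single class.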
To get an intuition for the number of examples per class, see the ImageNet statistics: http://image-net.org/about-stats
It seems the counts there are all around or above 10K for one class. So maybe it would be nice if the framework could somehow help the user get at least that many examples per class?
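One simple way the framework could help is by reporting how far each class is from the target. The sketch below is only an assumption about what such a check might look like; `shortfall_per_class` and `TARGET_PER_CLASS` are hypothetical names, and the 10K default is just the rough figure mentioned above.

```python
from collections import Counter

# Assumed target, taken from the rough 10K-per-class figure in the note.
TARGET_PER_CLASS = 10_000


def shortfall_per_class(labels, target=TARGET_PER_CLASS):
    """Return {label: number of missing examples} for classes below target."""
    counts = Counter(labels)
    return {
        label: target - count
        for label, count in counts.items()
        if count < target
    }


if __name__ == "__main__":
    # Tiny made-up label list, with a small target so the output is readable.
    labels = ["add", "add", "add", "sub"]
    print(shortfall_per_class(labels, target=3))
```

The returned counts could then drive how many extra augmented examples to generate for each under-represented class.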
Are there other linguistic ways of changing a sentence (maybe its syntax) that keep the same meaning but alter the way the sentence looks?
For example, augmenting images is easy: rotations already provide a useful way to enlarge a data set.