dreamflasher opened this issue 5 years ago
Training DNNs typically uses random resizing, cropping, rotation, etc. for data augmentation. How do you do this with jpeg2dct?
I guess an easy solution is to first resize, crop, and rotate the normally decoded JPEG image and then encode it again.
That's unfortunately not a solution. DNN training requires many random transformations, so that can't be precomputed in a meaningful way.
I don't see why it can't be done in real time; the pipeline decode normally -> data augmentation -> encode normally -> jpeg2dct shouldn't take a lot of time.
Of course, a cleverer way would be to somehow map the data augmentation process onto the "input image" after jpeg2dct.
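For concreteness, here is a minimal sketch of that decode -> augment -> re-encode -> jpeg2dct pipeline, assuming Pillow for the augmentations; the function name, augmentation parameters, and the assumption that inputs are at least `size` pixels on each side are all illustrative:

```python
import io
import random

from PIL import Image
from jpeg2dct.numpy import loads

def augmented_dct(path, size=224, quality=95):
    """Decode a JPEG, apply random augmentations on pixels,
    re-encode in memory, and read back the DCT coefficients."""
    img = Image.open(path).convert("RGB")
    # Random resize (scale jitter) and rotation on decoded pixels.
    scale = random.uniform(1.0, 1.3)
    img = img.resize((int(img.width * scale), int(img.height * scale)))
    img = img.rotate(random.uniform(-15, 15))
    # Random crop to the target size.
    left = random.randint(0, max(img.width - size, 0))
    top = random.randint(0, max(img.height - size, 0))
    img = img.crop((left, top, left + size, top + size))
    # Re-encode to JPEG in memory, then extract DCT coefficients.
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    dct_y, dct_cb, dct_cr = loads(buf.getvalue())
    return dct_y, dct_cb, dct_cr
```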
I think you don't understand the point of jpeg2dct… the whole point is to save the time of the decoding/encoding :)
I understand the purpose is to save the time of decoding/encoding at inference time. All the extra encoding and decoding due to data augmentation would only happen during training, since data augmentation is only used in training. Therefore the extra time spent on encoding/decoding at inference is zero.
Speeding up training is relevant, and that's what I personally care about.
Well, then maybe there is some way to map the data augmentation onto the "image" after jpeg2dct (a sketch of the idea follows below).
PS: have you done any experiments with jpeg2dct? Is the outcome really as good as the original paper claims?
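To illustrate what such a mapping could look like: some augmentations have a direct DCT-domain equivalent. Below is a hedged sketch (not part of jpeg2dct; the function names are mine) of an 8x8-block-aligned random crop and a horizontal flip applied straight to the coefficient arrays, assuming the 64 channels are the 8x8 block coefficients in natural row-major frequency order (de-zigzag first if yours are in zigzag order). Arbitrary pixel-level crops and rotations do not map this simply, which is why the general problem is hard.

```python
import numpy as np

def block_aligned_crop(dct, out_blocks_h, out_blocks_w, rng=np.random):
    """Random crop restricted to the 8x8 JPEG block grid.
    dct has shape (blocks_h, blocks_w, 64)."""
    h, w, _ = dct.shape
    top = rng.randint(0, h - out_blocks_h + 1)
    left = rng.randint(0, w - out_blocks_w + 1)
    return dct[top:top + out_blocks_h, left:left + out_blocks_w, :]

def hflip_dct(dct):
    """Horizontal flip in the DCT domain: reverse the block order
    along the width axis and negate coefficients with odd horizontal
    frequency, since mirroring x -> 7-x maps F(v, u) to (-1)**u * F(v, u)."""
    h, w, _ = dct.shape
    coeffs = dct.reshape(h, w, 8, 8)[:, ::-1, :, :].copy()  # axes (v, u)
    u = np.arange(8)
    sign = np.where(u % 2 == 1, -1, 1).astype(coeffs.dtype)
    coeffs *= sign  # broadcasts over the last (horizontal-frequency) axis
    return coeffs.reshape(h, w, 64)
```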
Hi, I recently started experimenting with this library, and I could not get close to reproducing the results reported in the original paper (I also opened an issue asking for a training script or a trained model). Have you made any progress with this since posting the comment?
No, sorry. Zero progress.
Dear @dreamflasher, encoding/decoding is not the part of the process that speeds up inference. The DCT coefficients are already a compressed representation; to compress a given image to an equivalent size, a network would need a considerable number of convolutional layers, which are computationally expensive. The main idea of this work is to use the already compressed form of the image to skip the 1st and 2nd blocks of ResNet, which contain many convolutional layers.
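A rough sketch of that shape argument, assuming a 224x224 JPEG with 4:2:0 chroma subsampling (the filename is illustrative): jpeg2dct returns the luma coefficients at 1/8 the spatial resolution with 64 channels, which is already comparable to the 28x28 feature maps a standard ResNet-50 only reaches after its first stages.

```python
from jpeg2dct.numpy import load

dct_y, dct_cb, dct_cr = load("image_224x224.jpg")  # illustrative filename
print(dct_y.shape)   # (28, 28, 64) -- comparable to a mid-ResNet feature map
print(dct_cb.shape)  # (14, 14, 64) due to 4:2:0 chroma subsampling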