tensorflow / tpu

Reference models and tools for Cloud TPUs.
https://cloud.google.com/tpu/
Apache License 2.0

How to train efficientnet on CIFAR-10 or CIFAR-100? Image size is 32x32. #421

Open ypw-rich opened 5 years ago

ypw-rich commented 5 years ago

The default model input resolutions range from 224 to 600. What adjustments should I make to fit CIFAR-10's 32x32 images?

def efficientnet_params(model_name):
  """Get efficientnet params based on model name."""
  params_dict = {
      # (width_coefficient, depth_coefficient, resolution, dropout_rate)
      'efficientnet-b0': (1.0, 1.0, 224, 0.2),
      'efficientnet-b1': (1.0, 1.1, 240, 0.2),
      'efficientnet-b2': (1.1, 1.2, 260, 0.3),
      'efficientnet-b3': (1.2, 1.4, 300, 0.3),
      'efficientnet-b4': (1.4, 1.8, 380, 0.4),
      'efficientnet-b5': (1.6, 2.2, 456, 0.4),
      'efficientnet-b6': (1.8, 2.6, 528, 0.5),
      'efficientnet-b7': (2.0, 3.1, 600, 0.5),
  }
  return params_dict[model_name]

https://github.com/tensorflow/tpu/blob/master/models/official/efficientnet/efficientnet_builder.py#L28-L41

@mingxingtan

jiading-zhu commented 5 years ago

I was going to ask the same question because I've also been trying to run efficientnet on CIFAR-10. From some comments in tpu/models/official/efficientnet/imagenet_input.py, it appears that the required input data format is generated by https://github.com/tensorflow/tpu/blob/master/tools/datasets/imagenet_to_gcs.py.

Since from the paper it appears that you guys have also run efficientnet models on CIFAR-10, do you guys have a similar script that processes CIFAR-10 data? If so, do you plan to release it as well?

@mingxingtan

mingxingtan commented 5 years ago

I was using the same image sizes as ImageNet for all transfer learning experiments for the paper. Will try to open source transfer learning scripts soon.

ngoanpv commented 5 years ago

@mingxingtan Is there any update on the transfer learning script that you tested on CIFAR-10, as mentioned in the paper? Thanks.

raminrasoulinezhad commented 5 years ago

Hi @mingxingtan, I also need to understand your method for other datasets. EfficientNet-B0 cannot handle 32x32 images, since the strides and kernel sizes are designed for ImageNet-sized images. At least the average pooling in stage 9 should be removed. Am I right?

What does it mean to use the same architecture on low-resolution images? Am I right that your script maps each 32x32 image to a 224x224 image using padding, upsampling,...? Thanks a lot.

mingxingtan commented 5 years ago

You are right, 32x32 images are too small for EfficientNets. You just need to upsample using tf.image.resize_images.
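
The upsampling step can be illustrated without any framework. What follows is a minimal sketch, assuming nearest-neighbor interpolation and an integer scale factor (224 / 32 = 7). The function name `upsample_nearest` and the synthetic image are hypothetical and not part of the EfficientNet code. `tf.image.resize_images`, as named in the reply above, operates on tensors and offers interpolation methods such as bilinear, so its pixel values will generally differ from this sketch.

```python
def upsample_nearest(image, factor):
    """Upsample an H x W x C image (nested lists) by an integer factor.

    Each source pixel is repeated `factor` times along height and width.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    upsampled = []
    for row in image:
        # Repeat every pixel `factor` times along the width.
        wide_row = [list(pixel) for pixel in row for _ in range(factor)]
        # Repeat the widened row `factor` times along the height.
        for _ in range(factor):
            upsampled.append([list(pixel) for pixel in wide_row])
    return upsampled


# Hypothetical 32x32 RGB image standing in for a CIFAR sample.
cifar_like = [[[y, x, 0] for x in range(32)] for y in range(32)]
resized = upsample_nearest(cifar_like, 224 // 32)  # integer factor of 7
print(len(resized), len(resized[0]), len(resized[0][0]))  # prints: 224 224 3
```

The same idea applies to the other resolutions listed in `efficientnet_params`, although most of them are not integer multiples of 32 and would need a real interpolating resize.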

raminrasoulinezhad commented 5 years ago

Hi @mingxingtan ,

Cool. So, can you confirm that 1) you trained EfficientNet-B0 on ImageNet, 2) you then prepared the CIFAR-10 images for your network (using just upsampling), and 3) you then froze the convolution layers (even the first one, which is applied to the input image), retrained the final FC layer (which is smaller now), and reported the resulting accuracy for CIFAR-10/100?

Thanks a lot. Sincerely,

mingxingtan commented 5 years ago

I did not freeze anything. Just fine-tuned all weights.

mingxingtan commented 5 years ago

Better to finetune the entire network (don't freeze any weights).

raminrasoulinezhad commented 5 years ago

Hi @mingxingtan ,

Thanks a lot. Is it possible to provide us the script as well?

serser commented 4 years ago

Does anyone know whether the CIFAR script is published?

ChawDoe commented 4 years ago

@mingxingtan I trained efficientnet-b0 on CIFAR-100. I set the initial lr to 0.1 and multiplied it by 0.1 every 3 epochs, but I only get 85.62% accuracy, while the result in the paper is 88.1%. Could you tell me the detailed settings?
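
For readers who want to reproduce the schedule described in this comment, here is a minimal sketch of step decay, assuming zero-based epochs and that the decay is applied at epoch boundaries. The function name `step_decay_lr` and its parameter names are hypothetical. This is not the schedule used in the paper, which the thread does not state.

```python
def step_decay_lr(epoch, initial_lr=0.1, decay_factor=0.1, epochs_per_step=3):
    """Return the learning rate for a zero-based epoch under step decay."""
    # Number of times the rate has been decayed so far.
    num_decays = epoch // epochs_per_step
    return initial_lr * (decay_factor ** num_decays)


# Print the rate at the start of each 3-epoch step.
for epoch in range(0, 12, 3):
    print(epoch, step_decay_lr(epoch))
```

Under these assumptions the rate falls by a factor of 1000 within 9 epochs, so most of the weight updates happen in the first few epochs.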

5663015 commented 4 years ago

@mingxingtan I saw that you planned to provide the transfer learning code. When will the script be released? Or could you provide the details of the transfer learning settings? Thanks a lot. Sincerely,

PhilipMay commented 4 years ago

I would also be interested in the scripts.

> I did not freeze anything. Just fine-tuned all weights.

As far as I know, when you replace the fully connected part and fine-tune, the updates are so large that they destroy the learned weights of the CNN part.

chetnakhanna16 commented 3 years ago

Please check out this article to perform transfer learning on the CIFAR-100 dataset using EfficientNet-B0. https://towardsdatascience.com/cifar-100-transfer-learning-using-efficientnet-ed3ed7b89af2

siarez commented 2 years ago

I'm not interested in transfer learning. I want to train EfficientNet-b5 on Cifar100 from scratch.
My best hyper-parameters so far have been: SGD with momentum=0.9, batch size=256, lr=0.01, L2_weight_decay=1e-5. I haven't managed to get higher than 60% test accuracy! Is that expected, or are my hyper-parameters out of whack? Besides the poor test performance, it is also much slower to converge than a standard ResNet50.
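
To make the optimizer settings in this comment concrete, here is a minimal sketch of a single SGD-with-momentum update with L2 weight decay on a flat list of weights. It assumes one common convention: the decay term is added to the gradient, and the velocity accumulates gradients without learning-rate scaling. Frameworks differ on these details, and the comment does not say which one was used. The function name `sgd_momentum_step` is hypothetical, and batching (batch size 256) is not modeled.

```python
def sgd_momentum_step(weights, grads, velocity,
                      lr=0.01, momentum=0.9, weight_decay=1e-5):
    """Apply one SGD update with momentum and L2 weight decay.

    Returns the updated weights and the updated velocity as new lists.
    """
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, grads, velocity):
        # L2 weight decay contributes weight_decay * w to the gradient.
        g_total = g + weight_decay * w
        # Velocity is a decaying sum of past gradients.
        v_next = momentum * v + g_total
        new_velocity.append(v_next)
        new_weights.append(w - lr * v_next)
    return new_weights, new_velocity


# Hypothetical two-parameter example with zero initial velocity.
weights, velocity = [1.0, -2.0], [0.0, 0.0]
weights, velocity = sgd_momentum_step(weights, [0.5, 0.5], velocity)
print(weights, velocity)
```

With a decay coefficient of 1e-5, the decay term is several orders of magnitude smaller than the example gradients, so the first step is driven almost entirely by the gradient and the learning rate.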

Hugo101 commented 1 year ago

@siarez Hi, I also have the same issue. Did you figure this out? Thank you!

siarez commented 1 year ago

@Hugo101 if I remember correctly the issue was fixed after I scaled up the input images to 224x224.

Hugo101 commented 1 year ago

> @Hugo101 if I remember correctly the issue was fixed after I scaled up the input images to 224x224.

Got it. Thank you!