AdityaKane2001 / regnety

Implementation of RegNetY in TensorFlow 2
Apache License 2.0

Beyond GSoC #19

Closed: AdityaKane2001 closed this issue 3 years ago

AdityaKane2001 commented 3 years ago

@sayakpaul @MorganR

This issue consolidates potential future ideas for this repository. If I've left out something we've already discussed, please comment below. It would be really great if you could share what you'd like to see here. Regarding model training, I have TPU access for a few more days, so we can definitely take advantage of that.

sayakpaul commented 3 years ago

It might also be a good idea to implement the other RegNet variants, RegNet-X and RegNet-Z, to make it easier for the community to benchmark different models.

I would suggest trying to include RegNet-Y inside tf.keras.applications rather than making it a standalone package. That way the impact will be even greater.

Training with Noisy Student sounds good, but if your purpose is to distill smaller models, then this will be a better alternative.
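For reference, the core of a soft-target distillation loss is small. A minimal TF sketch (a generic formulation, not this repo's code; the temperature value is illustrative):

```python
import tensorflow as tf

def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    """Soft-target cross-entropy between teacher and student distributions."""
    teacher_probs = tf.nn.softmax(teacher_logits / temperature)
    student_log_probs = tf.nn.log_softmax(student_logits / temperature)
    # Scale by T^2 so the soft-loss gradients keep a magnitude comparable
    # to the hard-label loss (as in Hinton et al.'s formulation).
    return -tf.reduce_mean(
        tf.reduce_sum(teacher_probs * student_log_probs, axis=-1)
    ) * temperature ** 2
```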

sayakpaul commented 3 years ago

@AdityaKane2001 I was going through this list again. I have a few questions.

Sorry if you have already considered this, but do you plan to look into the model implementation again to make the blocks more open and customizable? For example, implementing the blocks as tf.keras.Model subclasses as opposed to tf.keras.layers.Layer subclasses?

We had also talked about re-debugging things to try to match the original performance in v2. Do you have that planned as well?

AdityaKane2001 commented 3 years ago

> implement the blocks as tf.keras.Model

Yes, that is something I want to do. It just slipped my mind amidst the PR and other things. This is definitely a priority.
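For concreteness, the direction would be roughly this (a minimal sketch; the block internals here are illustrative, not the repo's actual blocks):

```python
import tensorflow as tf

# Exposing a block as tf.keras.Model instead of tf.keras.layers.Layer lets
# users instantiate it standalone, call .summary(), and reuse it freely.
class YBlock(tf.keras.Model):
    def __init__(self, filters, **kwargs):
        super().__init__(**kwargs)
        self.conv = tf.keras.layers.Conv2D(filters, 1, padding="same", use_bias=False)
        self.bn = tf.keras.layers.BatchNormalization()
        self.relu = tf.keras.layers.ReLU()

    def call(self, inputs, training=False):
        x = self.conv(inputs)
        x = self.bn(x, training=training)
        return self.relu(x)
```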

> re-debugging things to try to match the original performance in v2.

I think it's fair to say this isn't possible using the setup in the paper. Since we've scrutinized this a lot during training, I think it'd be best to look at different training methods with a fresh mind over the next week. I asked you about self-supervision and knowledge distillation earlier for the same reason: to squeeze every drop of accuracy out of this.

Going forward, while the credits are available I'd like to train the rest of the models, albeit with suboptimal performance, and continue the accuracy experiments in parallel. Models: 8x RegNetY, 12x RegNetX, 12x RegNetZ.

I'm also planning to convert the existing models to TFLite this week.
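The conversion itself should be straightforward with the tf.lite converter. A minimal sketch (the path and quantization flag are placeholders):

```python
import tensorflow as tf

# Load a trained Keras model (path is illustrative).
model = tf.keras.models.load_model("path/to/saved_model")

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional post-training quantization

tflite_model = converter.convert()
with open("regnety.tflite", "wb") as f:
    f.write(tflite_model)
```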

sayakpaul commented 3 years ago

RegNets yield superior performance with self-supervision when trained on a larger dataset. Look at the SEER paper from FAIR.

As for other training methods, I would incorporate RandAugment for data preprocessing and Sharpness-Aware Minimisation (SAM) for training the models.
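For reference, the two-pass structure of a SAM update is simple to sketch. A rough TF version, assuming a standard Keras model, optimizer, and loss (rho and the overall structure follow Foret et al.; this is a sketch, not a drop-in implementation):

```python
import tensorflow as tf

def sam_train_step(model, optimizer, loss_fn, images, labels, rho=0.05):
    # First pass: gradients at the current weights.
    with tf.GradientTape() as tape:
        loss = loss_fn(labels, model(images, training=True))
    grads = tape.gradient(loss, model.trainable_variables)

    # Perturb the weights in the direction of greatest local sharpness.
    grad_norm = tf.linalg.global_norm(grads) + 1e-12
    eps = [g * rho / grad_norm for g in grads]
    for var, e in zip(model.trainable_variables, eps):
        var.assign_add(e)

    # Second pass: gradients at the perturbed weights.
    with tf.GradientTape() as tape:
        loss = loss_fn(labels, model(images, training=True))
    sam_grads = tape.gradient(loss, model.trainable_variables)

    # Restore the original weights, then apply the SAM gradients.
    for var, e in zip(model.trainable_variables, eps):
        var.assign_sub(e)
    optimizer.apply_gradients(zip(sam_grads, model.trainable_variables))
    return loss
```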

AdityaKane2001 commented 3 years ago

As suggested by @sayakpaul, I've removed the branch protection on main. I have also created a branch, archive_gsoc, which still requires one approval; I'm keeping that branch as a snapshot of the current state of the repo.

AdityaKane2001 commented 3 years ago

@sayakpaul

Going ahead, I think it would be best to keep Keras applications as a target, as I have TPU access for the next ~40 days. I will open an issue in the Keras repo, since it'd be better to share this with them before we start training in case the code needs refactoring (which will most probably be the case).
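The target developer experience would be something like this (hypothetical until the models actually land; the exact symbol names are up to the Keras team):

```python
import tensorflow as tf

# Hypothetical usage once RegNetY ships in tf.keras.applications.
model = tf.keras.applications.RegNetY002(weights="imagenet")
model.summary()
```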

sayakpaul commented 3 years ago

If need be, you can always apply for TPU credits via TRC directly.

AdityaKane2001 commented 3 years ago

See #15419.