tensorflow / swift-models

Models and examples built with Swift for TensorFlow
Apache License 2.0

Big Transfer #729

Closed acoadmarmon closed 3 years ago

acoadmarmon commented 3 years ago

Resubmitting with proper commit history for google-cla bot.

Hey S4TF Team!

I've re-implemented the paper Big Transfer (BiT): General Visual Representation Learning (https://arxiv.org/abs/1912.11370) in Swift for TensorFlow. Along the way, I added support for CIFAR-100 by slightly modifying the existing CIFAR-10 dataset, and implemented Mixup, a StandardizedConv2D layer, and more.

Let me know your thoughts on the code! I've done a lot of cleanup and worked to replicate the existing project structure / testing patterns.

Andrew Marmon

BradLarson commented 3 years ago

I don't mean to be a bother, but we're working towards finalizing the last pull requests before archiving the repository and I'd really like to have this in there. Do you believe it will be possible to complete this model by Friday? If so, that should give us one last chance to make a quick pass over it and pull it in.

acoadmarmon commented 3 years ago

@BradLarson @dan-zheng I have a lot of time today and tomorrow to finish this up. When is this repo scheduled to be archived? Will I still be able to get it in tomorrow?

Andrew

dan-zheng commented 3 years ago

Hi Andrew,

I believe Brad's currently OOO. I'm not sure about an exact archival time, but if you have time to complete the model by this Friday, that would be appreciated. I can update here if an archival time becomes known.

acoadmarmon commented 3 years ago

Sounds good. Thanks Dan!

acoadmarmon commented 3 years ago

Hey Dan,

With these new changes I've attempted to resolve your concerns about poorly named structs and missing documentation. I also started removing the Python TensorFlow calls, but I've run into some issues. Specifically, when I try to use the _Raw.randomCrop function, I get this error:

/swift-base/swift-apis/Sources/TensorFlow/Bindings/EagerExecution.swift:301: Fatal error: Op RandomCrop is not available in GraphDef version 561. It has been removed in version 8. Random crop is now pure Python.
Illegal instruction

While debugging, I found this per-image implementation of random cropping, which I would need to modify to handle batches of data: https://github.com/tensorflow/swift-models/blob/5e5d7b4db27ada4c840fc6ba564125b62449706c/Datasets/Imagenette/ImageNet.swift#L208. Is this the recommended way to randomly crop an image, since it seems _Raw.randomCrop isn't available?
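
For illustration, here is a minimal sketch of how a per-image random crop could be extended to a whole batch. It uses plain Swift arrays rather than Tensor so that it stands alone, and it is not the linked ImageNet.swift code. ImageBatch and randomCrop are hypothetical names that do not exist in swift-models or swift-apis, and the [batch, height, width, channels] layout is an assumption.

```swift
/// A batch of images stored as a flat array in [batch, height, width, channels] order.
/// Hypothetical helper type for illustration only; real code would use Tensor<Float>.
struct ImageBatch {
    var values: [Float]
    let batch: Int
    let height: Int
    let width: Int
    let channels: Int

    /// Flat index of channel `c` of pixel (`y`, `x`) in image `n`.
    func index(_ n: Int, _ y: Int, _ x: Int, _ c: Int) -> Int {
        return ((n * height + y) * width + x) * channels + c
    }
}

/// Crops every image in the batch to `cropHeight` x `cropWidth`,
/// drawing an independent top-left offset for each image.
func randomCrop<G: RandomNumberGenerator>(
    _ input: ImageBatch, cropHeight: Int, cropWidth: Int, using generator: inout G
) -> ImageBatch {
    precondition(cropHeight > 0 && cropHeight <= input.height, "Invalid crop height")
    precondition(cropWidth > 0 && cropWidth <= input.width, "Invalid crop width")

    var output = [Float]()
    output.reserveCapacity(input.batch * cropHeight * cropWidth * input.channels)

    for n in 0..<input.batch {
        // Each image gets its own offset, so crops differ across the batch.
        let top = Int.random(in: 0...(input.height - cropHeight), using: &generator)
        let left = Int.random(in: 0...(input.width - cropWidth), using: &generator)
        for y in top..<(top + cropHeight) {
            // With channels last, one cropped row is a contiguous slice.
            let rowStart = input.index(n, y, left, 0)
            let rowEnd = rowStart + cropWidth * input.channels
            output.append(contentsOf: input.values[rowStart..<rowEnd])
        }
    }
    return ImageBatch(
        values: output, batch: input.batch,
        height: cropHeight, width: cropWidth, channels: input.channels)
}

// Usage: crop a batch of two 4x4 RGB images down to 2x2.
var generator = SystemRandomNumberGenerator()
let images = ImageBatch(
    values: (0..<96).map { Float($0) }, batch: 2, height: 4, width: 4, channels: 3)
let cropped = randomCrop(images, cropHeight: 2, cropWidth: 2, using: &generator)
print(cropped.values.count)  // 2 * 2 * 2 * 3 = 24
```

Passing the generator in explicitly is a design choice in this sketch: a seeded generator would make the crops reproducible in tests.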

BradLarson commented 3 years ago

@acoadmarmon - I'd say that the above-linked implementation for cropping would probably be the easiest to use. Worst case, if you can't get random cropping working, feel free to comment it out and we'll bring in the rest of your functional code before the repository gets archived. Someone in the future might be able to extend it in a fork to complete the rest.

BradLarson commented 3 years ago

Also, I was able to fix the test failures that we've been seeing, so those should be out of the way whenever you feel comfortable with bringing this in.

acoadmarmon commented 3 years ago

Ok, got it! I think I'll comment it out for now, and in the future I or someone else can finish it off, so we can get this through before the repo is archived.

acoadmarmon commented 3 years ago

Ok, I've pushed the latest changes, which remove the use of Python TensorFlow from the training loop! Ready for review.

acoadmarmon commented 3 years ago

Thank you @BradLarson @dan-zheng! Appreciate the time you all took to review this and help clean it up 😄