tensorflow / swift-models

Models and examples built with Swift for TensorFlow
Apache License 2.0

Pix2pix checkpoint #696

Closed: inukshukdeveloper closed this 3 years ago

inukshukdeveloper commented 3 years ago

This is a draft PR that follows the checkpoint pattern established by other models (e.g. GPT-2). This work wasn't originally intended for formal submission, but some of it may be useful depending on where S4TF is headed. It only writes checkpoints for the generator half of the GAN, but it could obviously be extended to the discriminator as well.
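For context, a minimal sketch of that pattern: collect the generator's parameter tensors under stable, readable names and hand them to the checkpoint support code, as the GPT-2 example does. The `CheckpointWriter` initializer, the `write(to:name:)` call, and the module name below are assumptions about the Support/Checkpoints API and may not match its exact signatures; the toy model and tensor names are purely illustrative.

```swift
import Foundation
import TensorFlow
import ModelSupport  // module name assumed; swift-models' checkpoint support lives under Support/

// Toy stand-in for the pix2pix generator (the real model has many more layers).
struct TinyGenerator: Layer {
    var conv = Conv2D<Float>(filterShape: (4, 4, 3, 64))

    @differentiable
    func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        conv(input)
    }
}

let generator = TinyGenerator()

// GPT-2-style pattern: key every parameter tensor by a stable, human-readable name.
let tensors: [String: Tensor<Float>] = [
    "generator/conv/filter": generator.conv.filter,
    "generator/conv/bias": generator.conv.bias,
]

// Assumed writer API; the exact initializer and write call may differ.
let writer = CheckpointWriter(tensors: tensors)
try writer.write(to: URL(fileURLWithPath: "/tmp/pix2pix-checkpoint"), name: "Pix2PixGenerator")
```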

There are also some utility odds and ends (e.g. an extension making Image conform to Equatable) that may be of interest.
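As an illustration of the Equatable piece, the conformance could look roughly like the sketch below. It assumes the Image's pixel data is reachable as a `Tensor<Float>` via a `tensor` property; that accessor name is an assumption, not necessarily the actual API.

```swift
import TensorFlow
import ModelSupport  // module name assumed

// Sketch: treat two Images as equal when their shapes match and every pixel matches.
extension Image: Equatable {
    public static func == (lhs: Image, rhs: Image) -> Bool {
        let a = lhs.tensor   // `tensor` accessor assumed
        let b = rhs.tensor
        return a.shape == b.shape && (a .== b).all()
    }
}
```

With something like this in place, round-trip tests can assert `loadedImage == originalImage` directly instead of comparing tensors by hand.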

If any (or none) of the PR is suitable for inclusion, feedback is certainly welcome.

Several TODOs are also sprinkled throughout to show obvious remaining holes.

BradLarson commented 3 years ago

It's unfortunate that I've been focused on other work and haven't been able to push PR #631 (https://github.com/tensorflow/swift-models/pull/631) across the finish line, because that would remove a large portion of the boilerplate code here. You can see in that PR how the GPT-2 example is simplified, for example.

I haven't been pushing forward with other checkpoint-related extensions to models until that PR is completed, but I know it would be nice to have this available in many models here.

All that said, would you still like for us to continue to review the finer components of this PR?

inukshukdeveloper commented 3 years ago

No problem. I've been checking in on the progress of PR #631 and suspected portions of this PR would become redundant once it lands.

Yes, if you feel other portions of this draft PR would be useful, please take a look. I welcome any feedback.

Cheers, Mark

inukshukdeveloper commented 3 years ago

Procedural question: should I mark this draft PR as ready for review, or will it be reviewed in its draft state?

Thanks, Mark

inukshukdeveloper commented 3 years ago

Submitting for review. I understand that a good portion of this PR will be redundant once PR #631 is complete, but some portions could be of use now. The TODO items may also be useful, but I'd need feedback on whether finishing them would be worthwhile.

inukshukdeveloper commented 3 years ago

Thanks for the review. I'll begin looking at incorporating the feedback here shortly.

BradLarson commented 3 years ago

We'd like to get our last pull requests in within the next couple of days, in order to cleanly archive swift-models. Is there a way to re-scope this in time to use the new checkpointing API and simply incorporate your fixes? If so, we could bring that in before archiving.

I totally understand if that would take too much time, given the short notice we've provided. If it can't be done in time, I still appreciate the effort that you put in to enhance this.
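Roughly, the re-scope might look like the sketch below, assuming the Checkpointable protocol from PR #631 exposes `writeCheckpoint(to:name:)` and `readCheckpoint(from:name:)` extension methods. The names, signatures, and module name here are assumptions and may differ from what actually landed; the toy model and paths are placeholders.

```swift
import Foundation
import TensorFlow
import ModelSupport  // module providing Checkpointable (name assumed)

// Toy stand-in for the pix2pix generator (placeholder).
struct TinyGenerator: Layer, Checkpointable {
    var conv = Conv2D<Float>(filterShape: (4, 4, 3, 64))

    @differentiable
    func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        conv(input)
    }
}

// With Checkpointable, the support code enumerates the parameters itself, so the
// hand-built tensor dictionary from the older pattern goes away.
let directory = URL(fileURLWithPath: "/tmp/pix2pix-checkpoint")
var generator = TinyGenerator()
try generator.writeCheckpoint(to: directory, name: "Pix2PixGenerator")
try generator.readCheckpoint(from: directory, name: "Pix2PixGenerator")
```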

inukshukdeveloper commented 3 years ago

I can give it a shot. Checkpointable landed on main just recently, right?

-Mark

inukshukdeveloper commented 3 years ago

Hey Brad,

I just tried a model checkpoint write/read round trip and hit a fatal error because the number of key paths does not equal the number of tensor names (58 vs. 84 in my case). I simply wrote the checkpoint and read it back in without any customization. I haven't started digging into it yet, but I thought I'd check whether there is something obvious I'm missing.
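For what it's worth, the first thing I plan to compare is how many `Tensor<Float>` key paths the key-path walk finds versus how many tensor names the checkpoint machinery reports. A sketch of that count, using a toy model as a placeholder for the real generator:

```swift
import TensorFlow

// Toy stand-in for the generator (placeholder; the real pix2pix model is much larger).
struct TinyGenerator: Layer {
    var conv = Conv2D<Float>(filterShape: (4, 4, 3, 64))

    @differentiable
    func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
        conv(input)
    }
}

let generator = TinyGenerator()

// recursivelyAllWritableKeyPaths(to:) is the same enumeration the S4TF optimizers use;
// in principle its count should line up with the number of tensor names in the checkpoint.
let floatKeyPaths = generator.recursivelyAllWritableKeyPaths(to: Tensor<Float>.self)
print("Tensor<Float> key paths: \(floatKeyPaths.count)")
```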

-Mark

brettkoonce commented 3 years ago

@inukshukdeveloper you might look at some of the notes in #733

inukshukdeveloper commented 3 years ago

@brettkoonce Thanks! Working much better now.

inukshukdeveloper commented 3 years ago

Looks like the GPT-2 tests are failing to initialize when trying to read the checkpoint index file. The location appears to be the same on this branch and on main (https://storage.googleapis.com/gpt-2/models/117M/model.ckpt.index), but the load is failing. Did I miss something in the merge?
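To rule out the download itself, a quick Foundation-only check of the index URL (sketch):

```swift
import Foundation

// Fetch the GPT-2 checkpoint index file directly; a non-trivial byte count
// suggests the URL is fine and the failure is somewhere in the loading path.
let indexURL = URL(string: "https://storage.googleapis.com/gpt-2/models/117M/model.ckpt.index")!
do {
    let data = try Data(contentsOf: indexURL)
    print("Fetched \(data.count) bytes from \(indexURL)")
} catch {
    print("Fetch failed: \(error)")
}
```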