taki0112 / UGATIT

Official Tensorflow implementation of U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation (ICLR 2020)
MIT License

Pretrained archive is corrupted #50

Open KR3ND31 opened 5 years ago

KR3ND31 commented 5 years ago

I made a copy on my disk and then also tried downloading with curl (like here), but it made no difference: the archive is still corrupted.

abhinavdimri commented 5 years ago

Running unzip model.zip gives me this:

[screenshot: unzip output]

WinHGGG commented 5 years ago

On Windows 10 x64, 7zip decompresses it correctly, but the progress bar is broken: it shows 103/100. [two screenshots]

It can be loaded correctly. [screenshot]

neuralphene commented 5 years ago

Confirmed that 7zip on Windows extracted the file. It still indicated there were header errors, and my progress bar went to 150%+. I assume the zip file IS actually corrupt, just that 7zip handles it more gracefully. The extracted checkpoint is 7.5G now.

It still fails to load, however. Where @WinHGGG gets Load SUCCESS, I am seeing Load failed.... As a sanity check, what dir structure and parameters are you using to load the checkpoint @WinHGGG?

buckley-w-david commented 5 years ago

@neuralphene It's very particular about the exact name of the subdirectory within the checkpoint directory. From the UGATIT directory:

(.venv) [david@darvid-pc UGATIT]$ ls -s checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/*
      1 checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/checkpoint
7860716 checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/UGATIT.model-1000000.data-00000-of-00001
     32 checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/UGATIT.model-1000000.index
  14708 checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/UGATIT.model-1000000.meta
(.venv) [david@darvid-pc UGATIT]$ python main.py --dataset selfie2anime --phase test
.
.
.
 [*] Success to read UGATIT.model-1000000
 [*] Load SUCCESS
 [*] Test finished!

neuralphene commented 5 years ago

@buckley-w-david my invocation looks much like yours. Note the slight discrepancy in a couple of the file sizes.

dev@f0c8cf23125f:~/UGATIT$ ls -s checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/*
7860720 checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/UGATIT.model-1000000.data-00000-of-00001
     32 checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/UGATIT.model-1000000.index
  14708 checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/UGATIT.model-1000000.meta
      4 checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/checkpoint
dev@f0c8cf23125f:~/UGATIT$ python main.py --dataset selfie2anime --phase test
.
.
.
 [*] Reading checkpoints...
 [*] Failed to find a checkpoint
 [!] Load failed...

buckley-w-david commented 5 years ago

Well isn't that just weird.

If you know your way around Python I'd throw in some breakpoints in the load method in UGATIT.py (Around line 600) and first check if it's correctly detecting your checkpoint directory, then move on to see if there's some problem with your checkpoint file, since that error message seems to happen if it either can't find or fails when trying to load that file.

Can you paste the contents of the checkpoint file?
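The two checks suggested above can also be done outside the debugger. Below is a minimal sketch, assuming the index file is the plain-text file named checkpoint that sits next to the UGATIT.model-* files; the function name inspect_checkpoint_dir and the returned dictionary are hypothetical, and this does not reproduce the load method in UGATIT.py.

```python
import os
import re


def inspect_checkpoint_dir(checkpoint_dir):
    """Check the 'checkpoint' index file and the files it points at.

    Hypothetical helper: it only looks at file names on disk and does not
    call TensorFlow or the UGATIT code.
    """
    result = {"index_found": False, "latest": None, "missing": []}

    index_path = os.path.join(checkpoint_dir, "checkpoint")
    if not os.path.isfile(index_path):
        return result
    result["index_found"] = True

    with open(index_path, "r") as f:
        text = f.read()

    # The index file holds lines such as:
    #   model_checkpoint_path: "UGATIT.model-1000000"
    match = re.search(r'^model_checkpoint_path:\s*"([^"]+)"', text, flags=re.MULTILINE)
    if match is None:
        return result

    latest = os.path.basename(match.group(1))
    result["latest"] = latest

    # A checkpoint prefix is expected to have an .index file and at least one .data-* shard.
    names = os.listdir(checkpoint_dir)
    if latest + ".index" not in names:
        result["missing"].append(latest + ".index")
    if not any(name.startswith(latest + ".data-") for name in names):
        result["missing"].append(latest + ".data-*")
    return result
```

Calling inspect_checkpoint_dir on the subdirectory under checkpoint/ and printing the result shows whether the index file was found, which prefix it names, and which of the expected files are absent.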

suedroplet commented 5 years ago

@buckley-w-david Hi. I successfully extracted the file and I think I have put it in the right path, but the model can't load the checkpoint. Here is my checkpoint content: [two images]

suedroplet commented 5 years ago

@buckley-w-david Fine. I got it. Just remove the --light parameter.

TheGuywithTheHat commented 5 years ago

Did anyone manage to extract it on Linux? Archive Manager, jar, 7z, and unzip all failed for me.

creke commented 5 years ago

@neuralphene It's very particular about the exact name of the subdirectory within the checkpoint directory. From the UGATIT directory:

(.venv) [david@darvid-pc UGATIT]$ ls -s checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/*
      1 checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/checkpoint
7860716 checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/UGATIT.model-1000000.data-00000-of-00001
     32 checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/UGATIT.model-1000000.index
  14708 checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/UGATIT.model-1000000.meta
(.venv) [david@darvid-pc UGATIT]$ python main.py --dataset selfie2anime --phase test
.
.
.
 [*] Success to read UGATIT.model-1000000
 [*] Load SUCCESS
 [*] Test finished!

Could you post the md5 of the zip you've downloaded? Thank you.

banderlog commented 5 years ago

So it is still not reproducible? :))

neuralphene commented 5 years ago

Could you post the md5 of the zip you've downloaded? Thank you.

The md5sum is 1acedc844eca4605bad41ef049fba401. I've downloaded the file several times via several different methods and get the same md5sum each time.
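For anyone without the md5sum tool, the same comparison can be done in Python. This is a minimal sketch using the standard hashlib module; the function name file_md5 is made up for illustration, and the file is read in chunks so a multi-gigabyte archive does not have to fit in memory.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"" (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The returned string can be compared directly with the value posted above.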

My checkpoint file has the same contents as @suedroplet's:

model_checkpoint_path: "UGATIT.model-1000000"
all_model_checkpoint_paths: "UGATIT.model-996000"
all_model_checkpoint_paths: "UGATIT.model-997000"
all_model_checkpoint_paths: "UGATIT.model-998000"
all_model_checkpoint_paths: "UGATIT.model-999000"
all_model_checkpoint_paths: "UGATIT.model-1000000"

neuralphene commented 5 years ago

I've figured out what my issue in loading the model was: I had to explicitly add the --light=false parameter to the invocation. With that, it successfully loads the model and generates valid results.

CarlosLannister commented 5 years ago

Just use 7zip on Windows and make sure that you have enough free RAM.

neuralphene commented 5 years ago

@CarlosLannister while 7zip on Windows works, it still indicated an invalid header, which means the file is corrupt. Also, there are people out there who don't have access to Windows.

banderlog commented 5 years ago

The file names have a "Copy of" prefix because I downloaded them as proposed in #49 (by making a copy inside Google Drive).

Output of unzip -t Copy\ of\ 100_epoch_selfie2anime_checkpoint.zip:

Archive:  Copy of 100_epoch_selfie2anime_checkpoint.zip
warning [Copy of 100_epoch_selfie2anime_checkpoint.zip]:  4294967296 extra bytes at beginning or within zipfile
  (attempting to process anyway)
file #1:  bad zipfile offset (local header sig):  4294967296
  (attempting to re-compensate)
    testing: checkpoint/              OK
    testing: checkpoint/.DS_Store     OK
    testing: __MACOSX/                OK
    testing: __MACOSX/checkpoint/     OK
    testing: __MACOSX/checkpoint/._.DS_Store   OK
    testing: checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/   OK
    testing: checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/checkpoint   OK
    testing: __MACOSX/checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/   OK
    testing: __MACOSX/checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/._checkpoint   OK
    testing: checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/UGATIT.model-1000000.index   OK
    testing: checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/UGATIT.model-1000000.data-00000-of-00001  
  error:  invalid compressed data to inflate
file #12:  bad zipfile offset (local header sig):  702817526
  (attempting to re-compensate)
    testing: checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/UGATIT.model-1000000.meta   OK
At least one error was detected in Copy of 100_epoch_selfie2anime_checkpoint.zip.

Output of unzip -t Copy\ of\ 50_epoch_selfie2anime_checkpoint.zip:

Archive:  Copy of 50_epoch_selfie2anime_checkpoint.zip
warning [Copy of 50_epoch_selfie2anime_checkpoint.zip]:  4294967296 extra bytes at beginning or within zipfile
  (attempting to process anyway)
file #1:  bad zipfile offset (local header sig):  4294967296
  (attempting to re-compensate)
    testing: checkpoint/              OK
    testing: checkpoint/.DS_Store     OK
    testing: __MACOSX/                OK
    testing: __MACOSX/checkpoint/     OK
    testing: __MACOSX/checkpoint/._.DS_Store   OK
    testing: checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/   OK
    testing: checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/UGATIT.model-500001.meta   OK
    testing: checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/checkpoint   OK
    testing: __MACOSX/checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/   OK
    testing: __MACOSX/checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/._checkpoint   OK
    testing: checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/UGATIT.model-500001.data-00000-of-00001  
  error:  invalid compressed data to inflate
file #12:  bad zipfile offset (local header sig):  651653367
  (attempting to re-compensate)
    testing: checkpoint/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing/UGATIT.model-500001.index   OK
At least one error was detected in Copy of 50_epoch_selfie2anime_checkpoint.zip.

banderlog commented 5 years ago

unzip Copy\ of\ 100_epoch_selfie2anime_checkpoint.zip reported success, but the cumulative size of the extracted data was only ~750M.

The warning quoted above, warning [Copy of 100_epoch_selfie2anime_checkpoint.zip]: 4294967296 extra bytes at beginning or within zipfile, apparently means that someone added a 4GB blank brick to each archive. Why?

buckley-w-david commented 5 years ago

It probably didn't extract correctly. As the issue states, the archive is technically corrupt. As I understand it, the spec for zip files (without the ZIP64 extension) says the max size for both the archive and any file inside it is 4GB. The largest file in the 100 epoch archive (I didn't check the 50 one) is ~7.5GB, so the archive is not compliant with the standard.

The tools I've seen people try on Linux are compliant with the standard and fail to extract the archive. 7zip on Windows seems to be more forgiving, though, and breaks compliance with the standard to successfully extract the contents.
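Some context on the 4GB figure: the original zip format stores sizes and offsets in 32-bit fields, and the ZIP64 extension widens them to 64 bits. The 4294967296 in the unzip warning earlier in the thread is exactly 2^32, which would be consistent with a 32-bit field wrapping around, though nothing in this thread confirms how the archive was produced. The sketch below uses Python's standard zipfile module to list what an archive's central directory claims about each entry; the name list_zip_entries is hypothetical.

```python
import zipfile

# Largest value a 32-bit size or offset field can hold.
ZIP32_LIMIT = 0xFFFFFFFF


def list_zip_entries(path):
    """Return (name, uncompressed_size, needs_zip64) for every entry.

    Sizes come from the central directory, so on an archive whose sizes
    were truncated to 32 bits they will be the truncated values.
    zipfile.BadZipFile may be raised if the archive cannot be parsed.
    """
    rows = []
    with zipfile.ZipFile(path) as archive:
        for info in archive.infolist():
            needs_zip64 = (
                info.file_size >= ZIP32_LIMIT
                or info.compress_size >= ZIP32_LIMIT
                or info.header_offset >= ZIP32_LIMIT
            )
            rows.append((info.filename, info.file_size, needs_zip64))
    return rows
```

If the size listed for the .data-00000-of-00001 entry is far below the ~7.5GB people see after a successful extraction, that points at truncated size fields rather than damaged data.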

TheGuywithTheHat commented 5 years ago

Alright, 7zip under Wine appears to have worked.

neuralphene commented 5 years ago

I have created an LFS repository that contains the dataset and checkpoint provided. You can clone my repository and use the scripts provided to work with this repository. See the README for more info. Quick note: Linux only.

I tried to fork this repository and just add the dataset/checkpoint to it, but GitHub doesn't support adding LFS files to public forks.

buckley-w-david commented 5 years ago

Alright, I've done what probably should have been done in the first place and created a torrent for the data (at least what I downloaded, which is the 100 epoch version).

I won't seed it all the time, or indefinitely, but go nuts: https://www.dropbox.com/s/a27g94g5s731xts/UGATIT_selfie2anime_lsgan_4resblock_6dis_1_1_10_10_1000_sn_smoothing.torrent?dl=1

Also, if you download it, please seed.

neuralphene commented 5 years ago

@buckley-w-david does the torrent contain the corrupt zip, or the recompressed tgz, split into 1G parts, from my repo? I figured more people would have success with the split-up (and not corrupt) tgz than with the zip.
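The split-into-parts idea can be sketched in a few lines of Python. This is only an illustration of the technique, not the scripts from the repository mentioned above; the function names, the .partNNN naming scheme, and the buffer size are all assumptions.

```python
import os
import shutil


def split_file(path, part_size, out_dir, buffer_size=1024 * 1024):
    """Split a file into parts of at most part_size bytes; return the part paths in order."""
    os.makedirs(out_dir, exist_ok=True)
    part_paths = []
    index = 0
    with open(path, "rb") as src:
        while True:
            # Read the first chunk before creating a part, so no empty part is written.
            chunk = src.read(min(buffer_size, part_size))
            if not chunk:
                break
            part_path = os.path.join(
                out_dir, "%s.part%03d" % (os.path.basename(path), index)
            )
            with open(part_path, "wb") as dst:
                written = 0
                while chunk:
                    dst.write(chunk)
                    written += len(chunk)
                    if written >= part_size:
                        break
                    chunk = src.read(min(buffer_size, part_size - written))
            part_paths.append(part_path)
            index += 1
    return part_paths


def join_files(part_paths, out_path, buffer_size=1024 * 1024):
    """Concatenate the parts, in the given order, back into one file."""
    with open(out_path, "wb") as dst:
        for part_path in part_paths:
            with open(part_path, "rb") as src:
                shutil.copyfileobj(src, dst, buffer_size)
```

For 1G parts, part_size would be 1024 ** 3. Comparing a checksum of the rejoined file with the original is a cheap way to confirm that nothing was lost.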

buckley-w-david commented 5 years ago

Neither, I figured I'd cut out the annoying part that people are having trouble with and just go with the data uncompressed.

Bad on size, easier to do.

I'd advise people to go with @neuralphene's data if they can, since it'll be a smaller download, and significantly quicker since my upload speed is garbage.

neuralphene commented 5 years ago

@buckley-w-david ah, good idea. I was actually hoping for a torrent from the start. I went with the repo because my upload speed was garbage. Options are good!

banderlog commented 5 years ago

@neuralphene thank you, at last I was able to see whether the NN is as good as they wrote.

And no, it is not :)

I'm not judging the architecture, the code quality, or the way the authors provided the data and pretrained model, just the NN's output. The pictures in the paper and the results from test_dataset are awesome. Others are not.

E.g.: [image]

neuralphene commented 5 years ago

Just got an email from GitHub. They disabled LFS on the repo I put up. Apparently there's a 1GB limit. Sounds like the torrent is the way to go for those still looking to get the dataset / checkpoint.

vsalavatov commented 5 years ago

I managed to get the uncompressed data from another source, and in about an hour I'll start seeding it via the torrent provided by @buckley-w-david. (Right now the speed is 0; everybody has just 16% of the data.)

UPD: 10 MB/s upload, hooray!

buckley-w-david commented 5 years ago

Your help is much appreciated @RemmargorP

As I said, my upload is crap and I don't have a seed box to leave running at all times.

ghost commented 5 years ago

@banderlog I think you need to crop a significant amount of background for better results. In the example images you only see the face and some hair, which is what it's been trained on

I'm also going to seed the torrent, as soon as I can

xiaotaw commented 5 years ago

It probably didn't extract correctly. As the issue states, the archive is technically corrupt. As I understand it, the spec for zip files (without the ZIP64 extension) says the max size for both the archive and any file inside it is 4GB. The largest file in the 100 epoch archive (I didn't check the 50 one) is ~7.5GB, so the archive is not compliant with the standard.

The tools I've seen people try on Linux are compliant with the standard and fail to extract the archive. 7zip on Windows seems to be more forgiving, though, and breaks compliance with the standard to successfully extract the contents.

It seems to only work on macOS, but I use Ubuntu. Any solutions?

I came across the same problem:

warning [100_epoch_selfie2anime_checkpoint.zip]:  4294967296 extra bytes at beginning or within zipfile
  (attempting to process anyway)
file #1:  bad zipfile offset (local header sig):  4294967296

buckley-w-david commented 5 years ago

Try 7zip via wine, someone reported that that worked.

Alternatively download the already extracted contents via the torrent link I posted earlier

PeterBorisenko commented 5 years ago

7zip on Win 10 x64 does not work for me. [image]

Someone please seed the torrent!

buckley-w-david commented 5 years ago

From what I can see, there are currently 4 seeders on the torrent

xiaotaw commented 5 years ago

A friend unzipped it successfully on a Mac and transferred the files to me.

FantasyJXF commented 5 years ago

Using the default zip app on a Mac solves the problem. Really strange bug.

FEIYANGDEXIE commented 5 years ago

So just unzip on a Mac? Loading still failed for me. Do I need to create a directory called checkpoint before unzipping?

FEIYANGDEXIE commented 5 years ago

Maybe just download again and use 7z. But my GPU is out of memory...

benonilearns commented 4 years ago

I'm currently downloading from the torrent and will upload Mega and Google Drive links so people on Windows/Linux won't have to deal with the corrupt archive. If anyone wants to join the seeders (there are 3 now), it would be very welcome.

benonilearns commented 4 years ago

As is from torrent: Mega Link Gdrive Link

benonilearns commented 4 years ago

tar.xz: Mega Link Gdrive Link

brendon-boldt commented 4 years ago

The files @benonilearns linked to worked for me, though I did need to change checkpoint.txt to checkpoint (possibly due to my using TensorFlow 1.14.0).
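The rename mentioned above can be scripted. As far as I know, the TensorFlow 1.x checkpoint helpers look by default for an index file named exactly checkpoint, which would explain why checkpoint.txt is not picked up. The sketch below is a small convenience with a hypothetical function name; it never overwrites an existing checkpoint file.

```python
import os


def fix_checkpoint_index_name(checkpoint_dir):
    """Rename checkpoint.txt to checkpoint inside checkpoint_dir.

    Returns True if a rename happened, False if there was nothing to do
    or a file named 'checkpoint' already exists.
    """
    src = os.path.join(checkpoint_dir, "checkpoint.txt")
    dst = os.path.join(checkpoint_dir, "checkpoint")
    if os.path.isfile(src) and not os.path.exists(dst):
        os.rename(src, dst)
        return True
    return False
```

The directory to pass in is the UGATIT_selfie2anime_... subdirectory under checkpoint/, the one that also holds the UGATIT.model-* files.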