marco-c / autowebcompat

Automatically detect web compatibility issues
Mozilla Public License 2.0

Out of memory error while training vgg16 and vgg19 with imagenet weights on Colab #280

Closed sdv4 closed 5 years ago

sdv4 commented 5 years ago

As in issue #191, when training with input shape (224, 224) and the following command line arguments:

!python3 train.py -n=vgg16 -o=sgd -ct='Y vs D + N' -bw='imagenet'

I get the following output, after which the process is killed automatically:

Using TensorFlow backend.
tcmalloc: large alloc 6457049088 bytes == 0x4e70000 @  0x7f0ddb5c3001 0x7f0dd8bf8b85 0x7f0dd8c5bb43 0x7f0dd8c5da86 0x7f0dd8cf5868 0x5030d5 0x507641 0x504c28 0x502540 0x502f3d 0x506859 0x504c28 0x506393 0x634d52 0x634e0a 0x6385c8 0x63915a 0x4a6f10 0x7f0ddb1beb97 0x5afa0a
tcmalloc: large alloc 6457049088 bytes == 0x185ca6000 @  0x7f0ddb5c11e7 0x7f0dd8bf8a41 0x7f0dd8c5e7c0 0x7f0dd8c53ce5 0x7f0dd8cf64f3 0x5030d5 0x507641 0x504c28 0x502540 0x502f3d 0x506859 0x504c28 0x502540 0x502f3d 0x506859 0x504c28 0x502540 0x502f3d 0x506859 0x504c28 0x506393 0x634d52 0x634e0a 0x6385c8 0x63915a 0x4a6f10 0x7f0ddb1beb97 0x5afa0a
^C

Each tcmalloc allocation above is 6457049088 bytes (about 6 GiB), so it seems that another full-dataset allocation is happening somewhere other than the place that was fixed in #270.
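As a rough sanity check on the allocation size, the sketch below divides the reported byte count by the size of a single image. It assumes the images are held as dense 224x224 arrays with 3 channels; the channel count and the candidate dtypes are assumptions, not something the log states, and `images_in_allocation` is a hypothetical helper written for this check, not a function from the repository.

```python
# Size of each "large alloc" reported by tcmalloc in the log above.
ALLOC_BYTES = 6457049088

# Input shape from the issue; 3 channels (RGB) is an assumption.
HEIGHT, WIDTH, CHANNELS = 224, 224, 3


def images_in_allocation(alloc_bytes, bytes_per_value):
    """Return (whole images, leftover bytes) for a given element size."""
    bytes_per_image = HEIGHT * WIDTH * CHANNELS * bytes_per_value
    return divmod(alloc_bytes, bytes_per_image)


if __name__ == "__main__":
    # Candidate dtypes are assumptions; the log does not say which one is used.
    for dtype_name, size in (("float32", 4), ("float64", 8)):
        count, leftover = images_in_allocation(ALLOC_BYTES, size)
        print(f"{dtype_name}: {count} images, {leftover} bytes left over")
```

Under these assumptions the allocation divides evenly in both cases: 10724 images as float32, or 5362 images as float64. A remainder of zero is consistent with the buffer being one array holding many complete images, which fits the guess that the whole dataset is being allocated at once rather than batch by batch.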