std::bad_alloc in tensorflow

michalstepniewski commented 5 years ago

I am running the code on an AWS r5a.xlarge machine with the Deep Learning AMI (Ubuntu) Version 14.0 (ami-0089d61bf6a518044).

The machine has 32GB RAM and 4 vCPUs. I run into:

2019-05-17 16:24:17.915477: W tensorflow/core/framework/allocator.cc:124] Allocation of 3328000000 exceeds 10% of system memory.

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc

and the run seems to stall.
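For scale, the size in the warning can be converted by hand. This is only a rough sketch: the helper names are made up for illustration, and the 32 GiB total is an assumption taken from the instance's nominal RAM. The thread does not show which memory figure TensorFlow actually compares against, and it may be smaller than the nominal total.

```python
GIB = 2**30  # bytes per GiB


def to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / GIB


def fraction_of_total(num_bytes, total_gib):
    """Fraction of a machine's memory (given in GiB) used by one allocation."""
    return num_bytes / (total_gib * GIB)


allocation = 3_328_000_000  # bytes, copied from the allocator warning
nominal_ram_gib = 32        # assumption: nominal RAM of the instance

print(f"{to_gib(allocation):.2f} GiB in a single allocation")
print(f"{fraction_of_total(allocation, nominal_ram_gib):.1%} of {nominal_ram_gib} GiB")
```

A single tensor of roughly 3 GiB is already close to a tenth of the machine's nominal memory, so a handful of similar allocations would plausibly exhaust 32GB.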

emchristiansen commented 5 years ago

You're probably running out of memory. Can you try on a 64GB instance? Also, consider monitoring swap usage to make sure you're not thrashing.
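One way to keep an eye on memory and swap on a Linux instance is to poll `/proc/meminfo` while the job runs. The following is a minimal sketch, not part of this repository: `parse_meminfo` and `memory_summary` are hypothetical helper names, and it assumes the standard `MemAvailable`, `SwapTotal`, and `SwapFree` fields are present.

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most fields are reported in kB; the unit suffix is dropped.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def memory_summary(info):
    """Return available memory and used swap in GiB (inputs are in kB)."""
    kb_per_gib = 2**20
    swap_used_kb = info["SwapTotal"] - info["SwapFree"]
    return {
        "mem_available_gib": info["MemAvailable"] / kb_per_gib,
        "swap_used_gib": swap_used_kb / kb_per_gib,
    }


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")  # Linux only
    if meminfo.exists():
        print(memory_summary(parse_meminfo(meminfo.read_text())))
```

If `swap_used_gib` keeps climbing while `mem_available_gib` sits near zero, the process is likely thrashing rather than making progress.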

michalstepniewski commented 5 years ago

Thanks for the prompt reply. I am now running on an AWS instance with 61GB RAM and it's working so far, so apparently 32GB is not enough :)

emchristiansen commented 5 years ago

Thanks for the feedback, updated the README.

google / in-silico-labeling

std::bad_alloc in tensorflow #3