Closed: lfoppiano closed this issue 2 years ago
I've never trained GloVe, so I can't give a definitive answer, but I inherited this like Aunt May's old antique chair which I need to do something with.
Taking forever and doing things very slowly sounds like classic thrashing behavior. Is it using a lot of swap space and/or maxed out on the 80G you gave it?
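In case it helps anyone checking the same thing, here is a minimal sketch of one way to look at swap usage on Linux. It assumes `/proc/meminfo` is available and has the usual `SwapTotal:` and `SwapFree:` lines; the function name `swap_used_kb` is made up for this example and is not part of GloVe.

```shell
# swap_used_kb [FILE]: print the amount of swap in use, in kB.
# Reads /proc/meminfo by default (Linux only); a different file in the
# same format can be passed as the first argument.
swap_used_kb() {
  awk '/^SwapTotal:/ {total = $2}
       /^SwapFree:/  {free = $2}
       END           {print total - free}' "${1:-/proc/meminfo}"
}
```

Calling `swap_used_kb` a few times while `cooccur` is running gives a rough picture: a large or steadily growing number would be consistent with thrashing, while a value near zero suggests the slowdown comes from somewhere else.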
I understand the situation. Thanks anyway for answering. :-)
I checked and things seem fine (no swapping, some memory is still free). However, after checking this I stopped and restarted with verbose = 0 and with more memory, although I don't think the memory was the issue there.
I can confirm that restarting the process with verbose=0 and more memory finished in much less time
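For reference, a rough sketch of what such a rerun could look like, using the same variables as the `cooccur` command from the original report. The wrapper name `run_cooccur_quiet` is hypothetical, and `MEMORY` is deliberately left for the caller to set, since the value used for the rerun is not stated in this thread.

```shell
# run_cooccur_quiet: same cooccur invocation as in the original report,
# but with -verbose fixed to 0. Expects BUILDDIR, MEMORY, VOCAB_FILE,
# WINDOW_SIZE, CORPUS and COOCCURRENCE_FILE to be set by the caller,
# as they are in demo.sh.
run_cooccur_quiet() {
  "$BUILDDIR/cooccur" -memory "$MEMORY" -vocab-file "$VOCAB_FILE" \
    -verbose 0 -window-size "$WINDOW_SIZE" \
    < "$CORPUS" > "$COOCCURRENCE_FILE"
}
```

This only shows the changed invocation; the thread does not establish whether the verbosity setting or the extra memory made the difference.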
Awesome, thanks!
Hi all, I've been trying for a while now to train GloVe on a large dataset (1.2 TB).
The script successfully created the vocabulary, but the co-occurrence extraction has now been running for about two months:
$BUILDDIR/cooccur -memory $MEMORY -vocab-file $VOCAB_FILE -verbose $VERBOSE -window-size $WINDOW_SIZE < $CORPUS > $COOCCURRENCE_FILE
I can see the file cooccurrence.bin growing slowly, but I was wondering whether it is normal for it to run for such a long time?
-rw-r--r-- 1 lfoppian0 tdm 631G Sep 8 08:08 cooccurrence.bin
Thank you in advance
For information, I'm attaching the modified demo.sh script. Mainly I set memory to 80 GB and CPUs to 72, and modified various paths and file names: