Memory issues for network construction (i.e. nearest neighbor computation)

Hi Will,

Back to you with some memory issues. My experience so far is that SocialSent runs into memory problems once you reach a threshold of roughly 7000 words to score. So I ran it on a distributed architecture (SHARCNET) with 38000 words to score and asked for 16G of memory, yet it very soon ran out of memory again:

...
Using Theano backend.
/opt/sharcnet/python/2.7.8/intel/lib/python2.7/site-packages/scipy/lib/_util.py:35: DeprecationWarning: Module scipy.linalg.blas.fblas is deprecated, use scipy.linalg.blas instead
  DeprecationWarning)
Evaluating SentProp with 100 dimensional GloVe embeddings
Evaluating binary and continuous classification performance
LEXICON SEEDS EMBEDDINGS EVAL_WORDS
Traceback (most recent call last):
  File "concreteness.py", line 95, in
    sym=True, arccos=True)
  File "/home/genereum/socialsent-master/polarity_induction_methods.py", line 99, in random_walk
    M = transition_matrix(embeddings, **kwargs)
  File "/home/genereum/socialsent-master/graph_construction.py", line 62, in transition_matrix
    return Dinv.dot(L).dot(Dinv)
MemoryError
--- SharcNET Job Epilogue ---
job id: 12138822
exit status: 1
cpu time: 313s / 12.0h (0 %)
elapsed time: 479s / 12.0h (1 %)
virtual memory: 11.9G / 16.0G (74 %)

Job returned with status 1.
WARNING: Job only used 1 % of its requested walltime.
WARNING: Job only used 0 % of its requested cpu time.
WARNING: Job only used 65 % of allocated cpu time.
WARNING: Job only used 74% of its requested memory.
...
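
For scale, here is a rough back-of-the-envelope estimate of the memory involved. It rests on an assumption that has not been checked against the SocialSent code: that `transition_matrix` builds dense N x N float64 arrays over the N words being scored. The helper name `dense_matrix_gib` is purely illustrative and not part of SocialSent.

```python
def dense_matrix_gib(n_words, bytes_per_entry=8):
    """Size in GiB of one dense n_words x n_words matrix.

    bytes_per_entry=8 assumes float64 entries (an assumption, not
    something confirmed from the SocialSent source).
    """
    return n_words * n_words * bytes_per_entry / float(1024 ** 3)


for n in (7000, 38000):
    print("{:>6} words: {:.2f} GiB per dense matrix".format(n, dense_matrix_gib(n)))
```

Under that assumption a single matrix takes about 0.37 GiB for 7000 words and about 10.76 GiB for 38000 words. If `L` and `Dinv` are both dense, `Dinv.dot(L).dot(Dinv)` would need at least one more array of the same size for the result, which would not fit in 16G.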

A solution would be to run it 7000 words at a time. But maybe you know a way to increase the memory the program can use?
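
The batching workaround could be sketched roughly as follows. `score_fn` is a hypothetical stand-in for whatever call scores a list of words and returns a `{word: score}` dict; it is not an actual SocialSent function, and `batches` and `score_in_batches` are illustrative names.

```python
def batches(words, batch_size=7000):
    """Yield consecutive slices of `words`, each at most `batch_size` long."""
    for start in range(0, len(words), batch_size):
        yield words[start:start + batch_size]


def score_in_batches(words, score_fn, batch_size=7000):
    """Score `words` one batch at a time and merge the results.

    score_fn is hypothetical: it takes a list of words and returns
    a dict mapping each word to its score.
    """
    scores = {}
    for batch in batches(words, batch_size):
        scores.update(score_fn(batch))
    return scores


if __name__ == "__main__":
    # Dummy scorer so the sketch runs on its own.
    words = ["w%d" % i for i in range(38000)]
    result = score_in_batches(words, lambda batch: {w: 0.0 for w in batch})
    print(len(result), [len(b) for b in batches(words)])
```

One caveat with this approach: each batch would presumably get its own graph, so scores from different batches may not be directly comparable.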

Thanks, Michel
