Closed sumuzhao closed 4 years ago
hi, the cosine similarity matrix consumes about 30 GB of RAM, which is what caused your out-of-memory problem. Do you have a machine with more RAM? Alternatively, you can reduce the float precision from 64 bits to something lower, say 32 or 16 bits.
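For reference, here is a quick back-of-the-envelope sketch of what a dense similarity matrix costs at each precision, using the 65713-word vocabulary mentioned in this issue (the numbers are just arithmetic on the array sizes, not measured usage):

```python
import numpy as np

n = 65713  # vocabulary size reported in this issue

# Memory needed to hold the full (n, n) similarity matrix at each precision.
for dtype in (np.float64, np.float32, np.float16):
    gb = n * n * np.dtype(dtype).itemsize / 1024**3
    print(f"{np.dtype(dtype).name}: {gb:.1f} GB")

# Casting the embedding matrix up front halves (float32) or quarters
# (float16) the footprint of every intermediate array downstream.
embeddings = np.random.rand(100, 300)  # placeholder for the real vectors
embeddings = embeddings.astype(np.float32)
```

Even at float16 the full matrix is around 8 GB, which is why shrinking the vocabulary (or computing the matrix in chunks) may be needed on an 8 GB machine.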
Well... I'll try reducing the float precision, but I don't think it will work given my limited RAM... I'll look into alternatives, such as reducing the vocabulary size... Anyway, thanks for your suggestion.
yes, you can also shrink the vocab size.
I tried reducing the precision with the following line: `df = df.astype(np.float32)`, but I get this error: `ValueError: could not convert string to float: 'tt0000574'`. What should be done?
May I know where this line is used? I am not sure what "df" here refers to. Thanks!
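In case it helps: the error message suggests `df` contains a non-numeric column (`'tt0000574'` looks like a string ID), which `astype` cannot convert to float. A minimal sketch of one way around this, assuming `df` is a pandas DataFrame with an ID column next to the vector columns (all names here are made up for illustration, not taken from the repo):

```python
import numpy as np
import pandas as pd

# Hypothetical frame mimicking the error: a string ID column alongside
# float64 vector dimensions.
df = pd.DataFrame({
    "id": ["tt0000574", "tt0000591"],
    "d0": [0.12, -0.34],
    "d1": [0.56, 0.78],
})

# astype(np.float32) on the whole frame fails on the string column;
# downcast only the numeric columns instead.
num_cols = df.select_dtypes(include=[np.number]).columns
df[num_cols] = df[num_cols].astype(np.float32)
print(df.dtypes)
```

Alternatively, set the ID column as the index (or drop it) before casting.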
Hi,
I tried to pre-calculate the cosine similarity scores based on the counter-fitting word vectors, but ran into MemoryError problems. The word vector matrix is (65713, 300) and the final similarity matrix is (65713, 65713). There are some dot-product and element-wise division operations. I have 8 GB of RAM. Any suggestions?
Thanks a lot!
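One way to avoid materializing the full (65713, 65713) matrix is to normalize the vectors once and then compute the similarity matrix in row blocks. A minimal sketch (the function name and chunk size are my own, not from the repo):

```python
import numpy as np

def cosine_sim_chunked(vectors, chunk=2000):
    """Yield row blocks of the cosine similarity matrix, so the full
    (n, n) matrix never has to live in memory at once."""
    v = vectors.astype(np.float32)  # lower precision also saves memory
    # Element-wise division done once: unit-normalize each row vector.
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    for start in range(0, v.shape[0], chunk):
        # (chunk, n) block of similarities via a single dot product.
        yield start, v[start:start + chunk] @ v.T

# Example on a small placeholder matrix (the real one is (65713, 300)):
emb = np.random.rand(10, 300)
for start, block in cosine_sim_chunked(emb, chunk=4):
    # e.g. keep only the top-k most similar words per row here,
    # instead of storing the whole block.
    pass
```

Each block can be post-processed (for instance, keeping only the top-k neighbors per word) and discarded, so peak memory stays at roughly `chunk * n * 4` bytes instead of `n * n * 8`.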