Open vishvanath45 opened 6 years ago
Hi @vishvanath45 , I am working on document and term clustering with SOM, and I was actually using word2vec to vectorize the documents. Would you care to share your code? I could have a look at what you are describing here. Best Regards, Selena
I am running document clustering using Sompy, I was following the example given along with this project. I had lists of documents. Each element in list contains text contained in respective document. So I followed following steps -
When I run the following command
som = sompy.SOMFactory.build(document_list, mapsize, mask=None, mapshape='planar', lattice='rect', normalization='var', initialization='pca', neighborhood='gaussian', training='batch', name='sompy')
mapsize
is 20x20 size ofdocument_list
is 92520x92520 I read online and people suggested using batch training and reducing the features using pca, I have done that, but still I find my RAM getting 100% utilised, (I have 126 GB RAM, 12 Core processor) and have to interrupt the program.Any help at this time will be appreciated.