Closed caufieldjh closed 2 years ago
Hello @caufieldjh, could you try now from the current develop branch? I should have successfully reduced the peak memory requirements. Also, the GraphVisualizer now accepts an EmbeddingResult as input, so you do not need to call the get_node_embedding_from_index method anymore.
Hi @LucaCappelletti94 - I tried the current develop version of embiggen with some SPINE embeddings on the same graph, just to get to the visualization stage faster. This time, it ran out of memory in the middle instead of complaining during loading:
>>> visualizer.fit_and_plot_all(embedding.get_node_embedding_from_index(0))
/home/harry/kg-env/lib/python3.8/site-packages/sklearn/manifold/_t_sne.py:790: FutureWarning: The default learning rate in TSNE will change from 200.0 to 'auto' in 1.2.
warnings.warn(
/home/harry/kg-env/lib/python3.8/site-packages/sklearn/manifold/_t_sne.py:790: FutureWarning: The default learning rate in TSNE will change from 200.0 to 'auto' in 1.2.
warnings.warn(
Killed
I'm still using get_node_embedding_from_index(0), as this is what happens when I don't:
>>> visualizer.fit_and_plot_all(embedding)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/harry/kg-env/lib/python3.8/site-packages/embiggen/visualizations/graph_visualizer.py", line 4151, in fit_and_plot_all
node_embedding = self._get_node_embedding(
File "/home/harry/kg-env/lib/python3.8/site-packages/embiggen/visualizations/graph_visualizer.py", line 676, in _get_node_embedding
self._node_embedding_method_name = self.automatically_detect_node_embedding_method(
File "/home/harry/kg-env/lib/python3.8/site-packages/embiggen/visualizations/graph_visualizer.py", line 826, in automatically_detect_node_embedding_method
if node_embedding.dtype == "uint8" and node_embedding.min() == 0:
AttributeError: 'EmbeddingResult' object has no attribute 'dtype'
Ok, thanks! I'm fixing the second one. Could you please try to re-run the first thing and hit the stop button when you see the memory starting to climb, so we can easily identify where that happens?
Early stop from command line produces no output, and in a notebook the kernel just dies.
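When a process is killed outright, it gets no chance to print a traceback, so one way to still find out where the memory climbs is to write diagnostics to disk while it runs. The sketch below uses only the Python standard library and is an assumption on my part, not something used in this thread: the function name start_diagnostics and the file paths are hypothetical. It periodically dumps the tracebacks of all threads with faulthandler and logs the peak resident set size, so the last entries in both files survive the kill.

```python
import faulthandler
import os
import resource
import threading
import time


def start_diagnostics(mem_path, trace_path, interval=5.0):
    """Log peak memory and periodic tracebacks until the returned stop() is called.

    Hypothetical helper. `resource` is Unix-only; `ru_maxrss` is reported
    in kilobytes on Linux and in bytes on macOS.
    """
    # faulthandler writes straight to the file descriptor, so keep the file open.
    trace_file = open(trace_path, "w")
    faulthandler.dump_traceback_later(interval, repeat=True, file=trace_file)

    stop_event = threading.Event()

    def _log_memory():
        with open(mem_path, "a") as log:
            while not stop_event.is_set():
                peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
                log.write(f"{time.time():.1f}\t{peak}\n")
                # Flush and sync so the line is on disk if the process is killed.
                log.flush()
                os.fsync(log.fileno())
                stop_event.wait(interval)

    worker = threading.Thread(target=_log_memory, daemon=True)
    worker.start()

    def stop():
        stop_event.set()
        worker.join()
        faulthandler.cancel_dump_traceback_later()
        trace_file.close()

    return stop
```

Calling stop = start_diagnostics("mem.log", "trace.log") just before visualizer.fit_and_plot_all(...) and then comparing timestamps in mem.log with the last traceback in trace.log should narrow down the call that was active when memory peaked. The memory thread needs the interpreter lock, so its readings may pause during long native calls.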
Hi Harry, can we try to iterate on this with the new version?
Sure! With grape-0.1.10 (ensmallen-0.8.8 and embiggen-0.11.18), repeating the same process as above (though with DegreeSPINE and calling visualizer.fit_and_plot_all(embedding)) works perfectly. Thanks!
Perfect!
With grape 0.1.0, loading a graph of 10.80M heterogeneous nodes and 30.45M heterogeneous edges works as expected, but calling visualizer.fit_and_plot_all fails with a MemoryError:
I'm not sure what happens when this much memory is available.