scramblingbalam / IIT_in_orgs

Applying Information Integration Theory

Find a way to get over memory limits for find_all_paths in iGraph #7

Closed scramblingbalam closed 7 years ago

scramblingbalam commented 7 years ago

Neither network library I evaluated could enumerate all paths between two nodes within the available memory (16 GB minus Windows 10 overhead). This means I need to adapt the iGraph function, combining dynamic programming with disk storage, to work around the limit.
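For illustration only, here is a minimal sketch of one way to keep memory use down: yielding paths one at a time from a depth-first search instead of accumulating a list of lists. It is not the iGraph function discussed in this issue. The name `iter_simple_paths` and the plain-dict adjacency format are assumptions made for the example.

```python
def iter_simple_paths(adjacency, source, target):
    """Yield each simple path from source to target as a list of nodes.

    `adjacency` is a hypothetical dict mapping a node to an iterable of its
    neighbours. Only the current path is held in memory; finished paths are
    handed to the caller one at a time.
    """
    if source == target:
        yield [source]
        return

    path = [source]          # nodes on the path currently being explored
    on_path = {source}       # same nodes as a set, for fast membership tests
    # stack[i] iterates over the not-yet-visited neighbours of path[i]
    stack = [iter(adjacency.get(source, ()))]

    while stack:
        for neighbour in stack[-1]:
            if neighbour in on_path:
                continue                      # would revisit a node: skip
            if neighbour == target:
                yield path + [neighbour]      # finished path, do not descend
                continue
            # Descend one level and resume the outer loop from the new node.
            path.append(neighbour)
            on_path.add(neighbour)
            stack.append(iter(adjacency.get(neighbour, ())))
            break
        else:
            # All neighbours of the current node are exhausted: backtrack.
            stack.pop()
            on_path.discard(path.pop())
```

Because this is a generator, the caller decides what happens to each path (count it, score it, write it to disk), so peak memory depends on the longest path rather than on the number of paths. The running time is still proportional to the number of paths, which can grow very quickly with graph density.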

scramblingbalam commented 7 years ago

Found an interesting module, dask (http://dask.pydata.org/en/latest/index.html), that allows computation on datasets bigger than RAM. Unfortunately, dask arrays and bags are immutable, so the iGraph function won't work with them, since it constantly updates a list of lists.

scramblingbalam commented 7 years ago

Trying to use MongoDB as a data store for the dynamic programming approach.

scramblingbalam commented 7 years ago

Using Mongo, I've been able to reproduce the results of the in-memory version of the function. Right now, however, it only stores the adjacency list in Mongo, which isn't what actually takes up all the memory. The function also takes about 100 times longer.
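As a rough sketch of moving the finished paths (rather than the adjacency list) out of memory, the example below writes paths to disk in batches. It uses SQLite from the Python standard library as a stand-in for MongoDB so that it runs without a server. The function name `spill_paths`, the table layout, and the batch size are all assumptions for the example, not the code used in this repository.

```python
import json
import sqlite3
from itertools import islice


def spill_paths(paths, connection, batch_size=1000):
    """Write paths from any iterable to a SQLite table, one batch at a time.

    At most `batch_size` paths are held in memory at once. Returns the number
    of paths written. (Hypothetical helper; SQLite stands in for MongoDB.)
    """
    connection.execute(
        "CREATE TABLE IF NOT EXISTS paths "
        "(id INTEGER PRIMARY KEY, nodes TEXT NOT NULL)"
    )
    iterator = iter(paths)
    written = 0
    while True:
        # Pull the next batch lazily; each path is stored as a JSON array.
        batch = [(json.dumps(path),) for path in islice(iterator, batch_size)]
        if not batch:
            break
        connection.executemany("INSERT INTO paths (nodes) VALUES (?)", batch)
        connection.commit()
        written += len(batch)
    return written
```

A possible usage would be `spill_paths(path_generator, sqlite3.connect("paths.db"))`, where `path_generator` yields paths lazily. If the paths are first built as a full list in memory, batching the writes does not help.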

scramblingbalam commented 7 years ago

Even with all finished paths offloaded to Mongo, there is still a memory overflow at the nodes = 100, edges = 1000 scale. My adviser said that a smaller graph would be fine, so I'm closing this issue.