gephi / gephi-toolkit

Gephi Toolkit - All Gephi in a Library
https://gephi.org/toolkit

NodePartitionFilter not working #43

Open andehr opened 3 years ago

andehr commented 3 years ago

Hello,

I'm generating a query in an automatically created Gephi project, but I can't get the NodePartitionFilter to work. Below, I create a query that filters nodes by their modularity class.

Everything executes fine, but when I open up the Gephi project, the Node Partition Filter is on a different column than the Modularity Class, and the "parts" property is null!

Is the code below the right way to create a node partition filter on the modularity class column? Thanks!

Query query = filterCtrl.createQuery(new IntersectionOperator()); // other queries are added to the intersection later
// Modularity filter
Column modularity = graphCtrl.getGraphModel().getNodeTable().getColumn(Modularity.MODULARITY_CLASS);
Partition partition = appearanceCtrl.getModel().getNodePartition(graphCtrl.getGraphModel().getGraph(), modularity);
NodePartitionFilter modularityFilter = new NodePartitionFilter(modularity, appearanceCtrl.getModel());
modularityFilter.init(graphCtrl.getGraphModel().getGraph());
modularityFilter.unselectAll();
// Select the top 4 modularity classes
for (Object part : Iterables.limit(partition.getSortedValues(), 4)) {
    modularityFilter.addPart(part);
}
// At this stage, the filter looks fine in the debugger - the column is the modularity class, and the parts property has the top 4 IDs
filterCtrl.setSubQuery(query, filterCtrl.createQuery(modularityFilter));
// Set filter as current for when project is opened.
filterCtrl.add(query);
filterCtrl.setCurrentQuery(query);

The issue seems to be in how the filter's parts property is serialised (for a partition filter over modularity class, it is a Set<Integer>). The serialised value is broken up by \r\n characters when it is encoded, so when Gephi deserialises that parameter in Serialization.fromText(), it receives only the fragment between two newlines rather than the full serialised string of the Set. Deserialisation of the parts Set therefore fails and returns null.

When I create a modularity partition in the Gephi UI, it serialises and deserialises fine, apparently using the same chunked base64 scheme, so I'm not sure why it struggles with projects created by the toolkit. Running Gephi in the debugger, it seems to receive the full string, ignoring newlines, when the project file was created by the UI.

I believe this is the problem because when I use the debugger to manually set the string to the full serialised value that the toolkit wrote, deserialisation succeeds.
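
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not Gephi's actual code: it uses the JDK's java.util.Base64 as a stand-in for commons-codec (the MIME encoder breaks lines with \r\n, the basic encoder does not), and the class name ChunkedBase64Demo and its fromText helper are hypothetical names that only mimic the behaviour reported for Serialization.fromText().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Base64;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; none of these names come from Gephi.
class ChunkedBase64Demo {

    // Java-serialise a value to bytes, standing in for how a filter parameter is written.
    static byte[] serialize(Object value) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // Decode Base64 text and deserialise it; return null on any failure,
    // which is the symptom reported for the "parts" property.
    static Object fromText(String text) {
        try {
            byte[] bytes = Base64.getDecoder().decode(text);
            try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                return ois.readObject();
            }
        } catch (IOException | ClassNotFoundException | IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Set<Integer> parts = new HashSet<>(Arrays.asList(0, 1, 2, 3));
        byte[] raw = serialize(parts);

        // Same bytes, two encodings: one single-line, one broken into lines with \r\n.
        String unchunked = Base64.getEncoder().encodeToString(raw);
        String chunked = Base64.getMimeEncoder().encodeToString(raw);

        // A reader that only sees the text up to the first line break gets a truncated payload.
        String firstLine = chunked.split("\r\n")[0];

        System.out.println("lines in chunked output: " + chunked.split("\r\n").length);
        System.out.println("from full text:  " + fromText(unchunked));
        System.out.println("from first line: " + fromText(firstLine));
    }
}
```

The first line of the chunked output is valid Base64 on its own, so decoding succeeds, but the resulting bytes are a truncated object stream and deserialisation fails.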

andehr commented 3 years ago

Wow, that was a pain to find. I have a workaround. The problem was that another of my Maven dependencies was pulling in a version of commons-codec that changes the chunking behaviour of the call that serialises the filter parameters: Base64.encodeBase64String(bos.toByteArray()).

I can work around this by explicitly adding a dependency on the commons-codec version that Gephi relies on (1.14).
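
For reference, a pin of this kind in a pom.xml might look like the fragment below. This is a sketch: the version is the one mentioned above and should be checked against the toolkit release actually in use. A directly declared dependency takes precedence over transitive ones under Maven's nearest-wins mediation, and running mvn dependency:tree -Dincludes=commons-codec shows which version ends up on the classpath.

```xml
<!-- Pin commons-codec so a transitive dependency cannot substitute another version -->
<dependency>
  <groupId>commons-codec</groupId>
  <artifactId>commons-codec</artifactId>
  <version>1.14</version>
</dependency>
```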

In future versions of Gephi, I would suggest using one of the encode methods that explicitly set the chunking behaviour, so the serialisation is more robust to changes in the default behaviour of commons-codec.
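
To illustrate the suggestion, here is a minimal sketch of making the chunking choice explicit at the call site. It uses the JDK's java.util.Base64 so that it runs without extra dependencies; in commons-codec, the overload Base64.encodeBase64(byte[] binaryData, boolean isChunked) is, as far as I know, one of the methods that takes the choice as a parameter. The class and method names below are hypothetical, and the lenient decoder is an extra assumption of mine rather than something Gephi does.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper; the names are illustrative only.
class ExplicitChunking {

    // Always single-line: the basic encoder never inserts line separators.
    static String encodeSingleLine(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Explicitly chunked, with line length and separator spelled out.
    static String encodeChunked(byte[] data) {
        return Base64.getMimeEncoder(76, new byte[] {'\r', '\n'}).encodeToString(data);
    }

    // The MIME decoder skips line separators, so it accepts either form.
    static byte[] decodeLenient(String text) {
        return Base64.getMimeDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] data = new byte[100];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }

        String single = encodeSingleLine(data);
        String chunked = encodeChunked(data);

        System.out.println("single-line has line breaks: " + single.contains("\n"));
        System.out.println("chunked has line breaks:     " + chunked.contains("\n"));
        System.out.println("both decode to the input:    "
                + (Arrays.equals(data, decodeLenient(single))
                && Arrays.equals(data, decodeLenient(chunked))));
    }
}
```

With the choice written out on the encoding side, a change in a library's default no longer alters the file format, and a whitespace-tolerant decoder on the reading side guards against files that were already written in chunked form.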