DHARPA-Project / kiara_plugin.tropy

Modules and utilities for integrating tropy and kiara.
Mozilla Public License 2.0
0 stars 1 forks source link

Refactoring of 'network_data' type to 'network_graph' #1

Open makkus opened 9 months ago

makkus commented 9 months ago

Lacking an idea where to put this information, I'll put all the notes I have regarding the refactoring of the 'network_data' type (that was used in the old 'network_analysis' plugin) to the simpler 'network_graph' data type into this issue.

makkus commented 9 months ago

I've refactored the central network data type, and renamed it from network_data to network_graph, since that is more correct now. I've tried to keep its interface the same where possible, it still references two arrow tables under the hood, so can be queried the same way as the old type. Other than that, not much is the same, it basically only allows for export as a networkx graph.

As requested here, users now specify the type of the network graph when assembling the data, which means there is an additional 'graph_type' user input in the assemble operation.

makkus commented 9 months ago

I'm in the process of copying all modules from the tagung repository into this plugin, with some minor adjustments so that they don't error out.

One thing that has changed in the data type interface is the as_networkx_graph method. The old one asked for the type of graph to be specified. In the new one the type is specified at graph creation time, so this input is not necessary anymore and I have removed it. It'll be necessary for Caitlin to double check all of those occurences and make sure nothing unexpected happens because of this change. I'd also recommend writing tests for all modules, using any possible types of graphs as input, as much as makes sense.

makkus commented 9 months ago

Ok, so I've moved all of the tagung modules in here, most of them at least don't break, except for the closeness_rank_list which I might not have the right example data for, and the cutpoints one where there is an issue with a numpy data type, which might also be related to my example data (added a TODO note in the source code).

I've not looked at the module code at all, apart from refactoring the as_networkx_graph (as mentioned above), and renaming instances of network_data to network_graph.

So, I think at this point @CBurge95 could take over, make sure my changes haven't broken anything, maybe write some tests using different graph-type example network graph data inputs, and make sure the code doesn't break for unexpected inputs (or breaks but gives useful error messages).

Once that's done, maybe submit it for review to someone else, and then me. I'll have a look at the code part then.