sailuh / kaiaulu

An R package for mining software repositories
http://itm0.shidler.hawaii.edu/kaiaulu
Mozilla Public License 2.0

model_directed_graph does not take a node list as a parameter, and network transformations do not always rely on it #242

Open carlosparadis opened 11 months ago

carlosparadis commented 11 months ago

This is an unfortunate oversight from when the functionality was still being designed and the separation of concerns between network.R and graph.R was still up in the air.

As shown in the function signature:

https://github.com/sailuh/kaiaulu/blob/7566f4ef50a0cd55eff47eeade3d12f186d143f0/R/graph.R#L111

The node list is not taken as a parameter. This normally does not have much impact on the analysis, since we are often interested in the edgelist (e.g. the trace between files and issues, the dependencies among files, or the files co-changed in a commit), but it can still lead to inconsistencies when visualizing networks (i.e. isolated nodes will not be shown).
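
To make the isolated-node problem concrete, here is a minimal base R sketch. The file names and the `name` column are hypothetical illustration data, not taken from Kaiaulu; only the `from` and `to` column names follow the nomenclature mentioned in this issue.

```r
# Hypothetical data: three files, only two of which share an edge.
nodes <- data.frame(name = c("a.R", "b.R", "c.R"), stringsAsFactors = FALSE)
edgelist <- data.frame(from = "a.R", to = "b.R", stringsAsFactors = FALSE)

# Node list recovered from the edgelist alone.
derived_nodes <- unique(c(edgelist$from, edgelist$to))

# Nodes that exist in the data but never appear in an edge,
# so a graph built only from the edgelist would not draw them.
isolated <- setdiff(nodes$name, derived_nodes)
print(isolated)
```

Here `c.R` never appears in an edge, so it ends up in `isolated` and would be missing from any visualization built from the edgelist alone.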

Some transform functions also do not rely on the model_directed_graph function, such as the transform for dependencies:

https://github.com/sailuh/kaiaulu/blob/7566f4ef50a0cd55eff47eeade3d12f186d143f0/R/network.R#L411C1-L411C1

This was partly because the output of Depends is already a set of nodes and an edgelist. However, the function should still be used to keep graph representations in Kaiaulu consistent with respect to the table nomenclature (i.e. from and to, instead of src_filepath and dest_filepath).
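
As a rough illustration of the nomenclature point, the sketch below renames dependency columns to the common `from`/`to` convention in base R. The table contents and the variable name `depends_edges` are hypothetical; only the four column names come from this issue, and this is not necessarily how the transform should be implemented in Kaiaulu.

```r
# Hypothetical dependency table using the parser's column names.
depends_edges <- data.frame(
  src_filepath = c("src/a.java", "src/b.java"),
  dest_filepath = c("src/b.java", "src/c.java"),
  stringsAsFactors = FALSE
)

# Rename to the common edgelist nomenclature before building the graph.
edgelist <- depends_edges
names(edgelist)[names(edgelist) == "src_filepath"] <- "from"
names(edgelist)[names(edgelist) == "dest_filepath"] <- "to"

print(names(edgelist))
```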

The graph code should handle both cases: when a node list is provided, and when it is not, in which case one should be generated from the edgelist.
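
One possible shape for this, sketched in base R under stated assumptions: `build_directed_graph` is a hypothetical helper (not the existing `model_directed_graph` signature), nodes are assumed to live in a `name` column, and the returned list layout is an assumption as well.

```r
# Hypothetical helper, not part of Kaiaulu's API.
# edgelist: data.frame with `from` and `to` columns.
# nodes: optional data.frame with a `name` column (assumed layout).
build_directed_graph <- function(edgelist, nodes = NULL) {
  edge_names <- unique(c(edgelist$from, edgelist$to))
  if (is.null(nodes)) {
    # No node list given: derive it from the edgelist.
    nodes <- data.frame(name = edge_names, stringsAsFactors = FALSE)
  } else {
    # Node list given: it must cover every endpoint in the edgelist.
    missing_nodes <- setdiff(edge_names, nodes$name)
    if (length(missing_nodes) > 0) {
      stop("edgelist references nodes absent from the node list: ",
           paste(missing_nodes, collapse = ", "))
    }
  }
  list(nodes = nodes, edgelist = edgelist)
}

edges <- data.frame(from = "a.R", to = "b.R", stringsAsFactors = FALSE)
all_files <- data.frame(name = c("a.R", "b.R", "c.R"), stringsAsFactors = FALSE)

g_derived <- build_directed_graph(edges)             # nodes from edgelist
g_explicit <- build_directed_graph(edges, all_files) # keeps isolated c.R
```

With an explicit node list, the isolated node `c.R` is kept; without one, only the two endpoints of the edge are present.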