sailuh / kaiaulu

An R package for mining software repositories
http://itm0.shidler.hawaii.edu/kaiaulu
Mozilla Public License 2.0

Generalize dependencies_to_sdsmj() and gitlog_to_hdsmj() in transform_*() and graph_to_dsmj() #195

Closed by leilani-reich 1 year ago

leilani-reich commented 1 year ago

I will update the graph.R and network.R with the graph_to_dsmj(), transform_dependencies_to_sdsmj(), and transform_gitlog_to_hdsmj() functions for milestone 3.4 of DV8 integration.

Right now, I have done the following:

instead of

  git_bipartite <- transform_gitlog_to_bipartite_network(project_git, mode = "commit-file")
  cochange_table <- bipartite_graph_projection(git_bipartite, mode = FALSE,
                                               weight_scheme_function = weight_scheme_count_deleted_nodes)

in my transform_gitlog_to_hdsmj().
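For readers unfamiliar with the projection step in the snippet above, here is a minimal base R sketch of the general idea: a commit-file bipartite network is collapsed into a file-file co-change table by counting the commits two files share. The toy data and all variable names are hypothetical, and this is not Kaiaulu's implementation of bipartite_graph_projection() or weight_scheme_count_deleted_nodes; it only assumes the weight is the number of shared commits.

```r
# Hypothetical toy data: one row per (commit, file) pair, i.e. the edges of
# a commit-file bipartite network. Not taken from any real repository.
commit_file <- data.frame(
  commit = c("c1", "c1", "c2", "c2", "c3"),
  file   = c("A.java", "B.java", "A.java", "B.java", "A.java"),
  stringsAsFactors = FALSE
)

# Incidence matrix: rows are commits, columns are files, entries are 0 or 1.
incidence <- table(commit_file$commit, commit_file$file)
incidence <- (unclass(incidence) > 0) * 1

# Projecting onto files: entry [i, j] counts the commits touching both files.
cochange <- crossprod(incidence)
diag(cochange) <- 0  # a file trivially co-changes with itself, so drop it

# Flatten the upper triangle into an undirected edge list.
idx <- which(upper.tri(cochange) & cochange > 0, arr.ind = TRUE)
cochange_table <- data.frame(
  from   = rownames(cochange)[idx[, "row"]],
  to     = colnames(cochange)[idx[, "col"]],
  weight = cochange[idx],
  stringsAsFactors = FALSE
)
```

In this toy case A.java and B.java appear together in commits c1 and c2, so the table holds a single undirected edge between them with weight 2.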

I tried this on my calculator project and got the following JSON: april27-calc-gitlog.json.zip

I am noticing some cells where src and dest are the same, which matches the nodes with arrows pointing to themselves in the temporal network here. Judging by the cell output, the graph also looks directed.
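Both observations can be checked mechanically once the cells are loaded into a data frame. The sketch below uses hypothetical cells with src and dest columns (the values are invented, not read from the attached JSON): it picks out the self-loops and tests whether every src to dest pair also appears reversed, which is what an undirected graph would produce.

```r
# Hypothetical cells shaped like the src/dest pairs in the DSM JSON.
# The index values are invented for illustration.
cells <- data.frame(
  src  = c(0, 0, 1, 2),
  dest = c(0, 1, 2, 1)
)

# Self-loops: cells whose source and destination are the same index.
self_loops <- cells[cells$src == cells$dest, ]

# The cell set is symmetric (undirected) only if every src -> dest pair
# also appears reversed as dest -> src.
forward <- paste(cells$src, cells$dest)
reverse <- paste(cells$dest, cells$src)
is_symmetric <- all(reverse %in% forward)
```

Here the cell 0 -> 1 has no matching 1 -> 0, so is_symmetric is FALSE, which is the pattern a directed graph would show.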

Is this what you wanted me to look at for the generalization? I also asked on the issue here.

Also, do you want transform_gitlog_to_hdsmj() to have the option to choose the temporal network? If so, what should I rename the "Cochange" value to in my hdsm.json?

Thanks

leilani-reich commented 1 year ago

Hi @carlosparadis, I made some updates to my functions and added a separate transform_temporal_gitlog_to_adsmj. I also fixed how graph_to_dsmj gets its parameters. I ran the functions and they all produced JSON output that looked as expected, but I will test more thoroughly. I also think I am seeing issues with the parse_dependencies() function again, specifically with the file path it takes in. I will let you know about that.

codecov[bot] commented 1 year ago

Codecov Report

Merging #195 (7503646) into master (2b10c5b) will decrease coverage by 0.05%. The diff coverage is 0.00%.

Note: Current head 7503646 differs from pull request most recent head 4e3580a. Consider uploading reports for the commit 4e3580a to get more accurate results.

@@            Coverage Diff            @@
##           master    #195      +/-   ##
=========================================
- Coverage    3.07%   3.03%   -0.05%     
=========================================
  Files          15      15              
  Lines        1786    1810      +24     
=========================================
  Hits           55      55              
- Misses       1731    1755      +24     

Impacted Files    Coverage Δ
R/graph.R         0.00% <0.00%> (ø)
R/network.R       0.00% <0.00%> (ø)
R/parser.R        0.00% <ø> (ø)

leilani-reich commented 1 year ago

Checking output of graph_to_dsmj

I checked the output JSON for my transform_gitlog_to_hdsmj, transform_dependencies_to_sdsmj, and transform_temporal_gitlog_to_adsmj. From what I checked, the results look reasonable.

For transform_gitlog_to_hdsmj and transform_dependencies_to_sdsmj, I was able to use the DV8 output for comparison and saw that it aligned.

As for transform_temporal_gitlog_to_adsmj, I'm not sure if I can verify that the "Collaborate" values are correct, but I visualized the graph following similar steps to the temporal network graph shown here, and got the following:

[Image: Screenshot 2023-04-29 at 11 07 49 PM]

This was done on the https://github.com/HouariZegai/Calculator git repo. Looking at the graph, I saw that each cell in my adsmj matched an arrow in the graph pointing from the cell's src author to its dest author.
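The visual comparison above could also be done programmatically. The following is a rough sketch, assuming the DSM JSON stores a list of variable names plus cells whose src and dest are 0-based indexes into that list (an assumption about the format, not something confirmed in this thread). The author names, the cells, and the edge list are all hypothetical and do not come from the Calculator repository.

```r
# Hypothetical author names (the DSM "variables") and cells.
# src/dest are assumed to be 0-based indexes into `variables`.
variables <- c("alice", "bob", "carol")
cells <- data.frame(src = c(0, 1), dest = c(1, 2))

# Hypothetical edge list of the temporal network that was plotted.
edgelist <- data.frame(
  from = c("alice", "bob"),
  to   = c("bob", "carol"),
  stringsAsFactors = FALSE
)

# Turn both representations into comparable "from -> to" strings.
# The + 1 converts the assumed 0-based indexes to R's 1-based indexing.
cell_edges  <- paste(variables[cells$src + 1], variables[cells$dest + 1], sep = " -> ")
graph_edges <- paste(edgelist$from, edgelist$to, sep = " -> ")

# Edges present on one side but not the other; both should be empty.
missing_from_cells <- setdiff(graph_edges, cell_edges)
extra_in_cells     <- setdiff(cell_edges, graph_edges)
```

If both setdiff() results are empty, every arrow in the graph has a matching cell and vice versa. This checks only the edge structure, not the "Collaborate" weights.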

JSON output files to check

Here are the output JSON files from each of my transform functions. I have also included a DV8 output hdsm to compare against my transform_gitlog_to_hdsmj output, and a DV8 output sdsm to compare against my transform_dependencies_to_sdsmj output.

april29tests.zip