xianggebenben / GraphSL

Graph Source Localization Library
MIT License

[JOSS Review] Comments on paper.md #13

Closed: mbeyss closed 2 months ago

mbeyss commented 3 months ago

related to this JOSS Review

From the JOSS checklist

The summary is hard to understand for a reader who does not know graph diffusion. I suggest making the purpose a bit clearer, e.g. by putting the explanation of what graph diffusion models are useful for ("simulating information spread") earlier and elaborating on it more.

Here the basic problem (graph source localization) is nicely introduced in a way that non-experts can also understand. Figure 1 is rather useful for that. The relation to other work becomes fairly clear, as you use other people's software in a sort of unified framework. For me, however, it is not clear who the target audience is: is it developers of other graph source localization algorithms who want to compare their solution to others, or is it users of such algorithms who want to find the most suitable algorithm for their work? What I am missing for both cases are some steps on how to do that, i.e. how to incorporate a new algorithm or a new dataset. With that, the "problems the software is designed to solve" is also a bit unclear: technically it solves the graph source localization problem, but it remains unclear what the exact use of GraphSL is that other software in this field does not cover.

As it includes other packages, this is straightforward. However, as explained earlier, the usefulness of this (and the datasets) does not become clear.

The writing is generally of good quality. A few minor remarks:

  • line 27: "up-to-date state-of-the-art" seems redundant
  • in Methods and Benchmark Datasets, the past tense is used when describing other approaches; the present tense should be used here.

This looks mostly fine. I would suggest adding a reference to a recent review paper or similar that explains graph source localization (techniques), to help the reader dive deeper into the field. I feel like references are missing for the data in the table.

More general

xianggebenben commented 2 months ago

Thank you for the review. Below is our revision:

Q1. The summary is hard to understand for a reader who does not know graph diffusion. I suggest making the purpose a bit clearer, e.g. by putting the explanation of what graph diffusion models are useful for ("simulating information spread") earlier and elaborating on it more.

A1. We have revised the summary to explain graph diffusion. The revised text reads as follows:

We introduce GraphSL, a new library for studying the graph source localization problem. Graph diffusion and graph source localization are inverse problems in nature: graph diffusion predicts information diffusions from information sources, while graph source localization predicts information sources from information diffusions. GraphSL facilitates the exploration of various graph diffusion models for simulating information diffusions and enables the evaluation of cutting-edge source localization approaches on established benchmark datasets. The source code of GraphSL is made available at the GitHub repository. Bug reports and feedback can be directed to the GitHub issues page.
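
To make the forward direction in the summary concrete, here is a minimal sketch of an Independent Cascade simulation in plain Python. It does not use GraphSL's API: the function name `independent_cascade`, the adjacency-list graph representation, and the single activation probability `prob` are assumptions chosen for illustration only.

```python
import random
from typing import Dict, Iterable, List, Set

# Hypothetical graph representation: node -> list of out-neighbors.
Graph = Dict[int, List[int]]


def independent_cascade(
    graph: Graph, seeds: Iterable[int], prob: float, rng: random.Random
) -> Set[int]:
    """Simulate one Independent Cascade run and return all infected nodes.

    Each newly infected node gets exactly one chance to infect each of its
    not-yet-infected neighbors, succeeding with probability `prob`.
    """
    infected: Set[int] = set(seeds)
    frontier: List[int] = sorted(infected)
    while frontier:
        next_frontier: List[int] = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in infected and rng.random() < prob:
                    infected.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return infected


# Toy path graph 0 - 1 - 2 - 3 (both directions listed explicitly).
path_graph: Graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
observed = independent_cascade(path_graph, seeds=[0], prob=0.5, rng=random.Random(0))
```

Source localization is the inverse task: given only `observed` and the graph, estimate which nodes were in `seeds`.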

Q2. For me, however, it is not clear who the target audience is: is it developers of other graph source localization algorithms who want to compare their solution to others, or is it users of such algorithms who want to find the most suitable algorithm for their work? What I am missing for both cases are some steps on how to do that, i.e. how to incorporate a new algorithm or a new dataset. With that, the "problems the software is designed to solve" is also a bit unclear: technically it solves the graph source localization problem, but it remains unclear what the exact use of GraphSL is that other software in this field does not cover.

A2. (a) The target audience is both developers and practical users. Developers can add their own datasets and algorithms. Instructions are given in the contact section of the README file, which reads as follows:

We welcome your contributions! If you’d like to contribute your datasets or algorithms, please submit a pull request consisting of an atomic commit and a brief message describing your contribution.

For a new dataset, please upload it to the data folder. The file should be a dictionary object saved by pickle. It contains a key "adj_mat" with the value of a graph adjacency matrix (sparse numpy array with the CSR format).
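
A dataset file of this shape could be produced roughly as follows. This is a sketch under stated assumptions: it reads "sparse numpy array with the CSR format" as a `scipy.sparse.csr_matrix`, and the helper names `toy_adjacency`, `save_dataset` and `load_dataset` are hypothetical, not part of GraphSL. Any additional keys the library may expect are not shown.

```python
import pickle

import numpy as np
from scipy.sparse import csr_matrix


def toy_adjacency() -> csr_matrix:
    """Adjacency matrix of an undirected 3-node path graph 0 - 1 - 2."""
    dense = np.array(
        [
            [0, 1, 0],
            [1, 0, 1],
            [0, 1, 0],
        ]
    )
    return csr_matrix(dense)


def save_dataset(adj_mat: csr_matrix, path: str) -> None:
    """Pickle a dictionary holding the adjacency matrix under "adj_mat"."""
    with open(path, "wb") as f:
        pickle.dump({"adj_mat": csr_matrix(adj_mat)}, f)


def load_dataset(path: str) -> dict:
    """Load the pickled dictionary back, e.g. to check it before a pull request."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Calling `save_dataset(toy_adjacency(), "data/my_graph.pkl")` (the file name is made up) would then write a file in the described layout into the data folder.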

For a new algorithm, please determine whether it belongs to prescribed methods or GNN-based methods: if it belongs to the prescribed methods, add your algorithm as a new class in the GraphSL/Prescribed.py. Otherwise, please upload it as a folder under the GraphSL/GNN folder. Typically, the algorithm should include a "train" function and a "test" function, and the "test" function should return a Metric object.
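
The "train"/"test" contract described above could look roughly like the sketch below. Everything in it is illustrative: the `Metric` dataclass is a stand-in whose fields are assumptions (the real GraphSL `Metric` class may differ), and `TopDegreeBaseline`, its method signatures, and the adjacency-set graph format are hypothetical rather than taken from GraphSL/Prescribed.py.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

# Hypothetical graph representation: node -> set of neighbors.
Graph = Dict[int, Set[int]]


@dataclass
class Metric:
    """Stand-in result object; field names are assumptions, not GraphSL's."""

    precision: float
    recall: float
    f1: float


class TopDegreeBaseline:
    """Toy prescribed method: predict the k highest-degree nodes as sources."""

    def __init__(self) -> None:
        self.k = 1

    def train(self, graph: Graph, source_sets: List[Set[int]]) -> None:
        """Choose k as the rounded mean number of sources in the training data."""
        if not source_sets:
            raise ValueError("need at least one training source set")
        mean_size = sum(len(s) for s in source_sets) / len(source_sets)
        self.k = max(1, round(mean_size))

    def predict(self, graph: Graph) -> Set[int]:
        """Rank nodes by degree (ties broken by node id) and keep the top k."""
        ranked = sorted(graph, key=lambda n: (-len(graph[n]), n))
        return set(ranked[: self.k])

    def test(self, graph: Graph, source_sets: List[Set[int]]) -> Metric:
        """Average precision, recall and F1 of the prediction over test cases."""
        if not source_sets:
            raise ValueError("need at least one test source set")
        pred = self.predict(graph)
        p_sum = r_sum = f_sum = 0.0
        for truth in source_sets:
            tp = len(pred & truth)
            p = tp / len(pred) if pred else 0.0
            r = tp / len(truth) if truth else 0.0
            f = 2 * p * r / (p + r) if p + r > 0 else 0.0
            p_sum, r_sum, f_sum = p_sum + p, r_sum + r, f_sum + f
        n = len(source_sets)
        return Metric(p_sum / n, r_sum / n, f_sum / n)
```

The point of the sketch is only the shape of the interface: "train" fits whatever the method needs, and "test" returns a single metric object that can be compared across algorithms.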

Feel free to email me (junxiang.wang@alumni.emory.edu) if you have any questions. Bug reports and feedback can be directed to the GitHub issues page.

Practical users can use our GraphSL library for their own purposes. We have created a Jupyter notebook, tutorial.ipynb, to introduce the library's usage.

(b) Other software in this field does not support various simulations of information diffusion, and it also lacks real-world benchmark datasets and state-of-the-art source localization approaches. We have added this to the paper.

Q3. As it includes other packages, this is straightforward. However, as explained earlier, the usefulness of this (and the datasets) does not become clear.

A3. We have created a Jupyter notebook, tutorial.ipynb, to introduce the library's usage.

Q4. Some typos: line 27: "up-to-date state-of-the-art" seems redundant

In Methods and Benchmark Datasets, the past tense is used when describing other approaches; the present tense should be used here.

A4. Thank you for pointing them out. We have fixed them in the paper.

Q5. This looks mostly fine. I would suggest adding a reference to a recent review paper or similar that explains graph source localization (techniques), to help the reader dive deeper into the field. I feel like references are missing for the data in the table.

A5. I have added the survey paper and the missing references for the data in the table.

Q6. More general comments.

A6. We have added the necessary references to explain "Independent Cascade" and "Linear Threshold". We have also elaborated on the formulation and enhanced Figure 2 and Table 1. The explanations of the datasets can be found in the README file.

Please let me know if you have any concerns; we are happy to address them. Thanks.

mbeyss commented 2 months ago

Thank you very much for making the changes. For me, this issue is resolved.