openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal

[REVIEW]: CluSim: a Python package for the comparison of clusterings and dendrograms #1264

whedon closed this issue 5 years ago

whedon commented 5 years ago

Submitting author: @ajgates42 (Alexander Gates)
Repository: https://github.com/Hoosier-Clusters/clusim
Version: v0.3.2
Editor: @VivianePons
Reviewers: @pajaskowiak, @adavidzh
Archive: 10.5281/zenodo.2601868

Status

Status badge code:

HTML: <a href="http://joss.theoj.org/papers/16ce8a38d204c773202405e1c7da518c"><img src="http://joss.theoj.org/papers/16ce8a38d204c773202405e1c7da518c/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/16ce8a38d204c773202405e1c7da518c/status.svg)](http://joss.theoj.org/papers/16ce8a38d204c773202405e1c7da518c)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@pajaskowiak, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.theoj.org/about#reviewer_guidelines. Any questions/concerns please let @VivianePons know.

Please try and complete your review in the next two weeks

Review checklist for @pajaskowiak

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Review checklist for @adavidzh

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

whedon commented 5 years ago

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @pajaskowiak it looks like you're currently assigned as the reviewer for this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' at https://github.com/openjournals/joss-reviews
  2. You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications

For a list of things I can do to help you, just type:

@whedon commands
whedon commented 5 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 5 years ago

:point_right: Check article proof :page_facing_up: :point_left:

adavidzh commented 5 years ago

I have a few fundamental comments from the pre-review which I would hope the authors can comment on.

arfon commented 5 years ago

👋 @ajgates42 - it seems like @adavidzh is waiting on your feedback on their comments here: https://github.com/openjournals/joss-reviews/issues/1192#issuecomment-457099152

ajgates42 commented 5 years ago

Thanks @adavidzh for the close reading of our work! I understand that the lack of detail in our first write-up made it difficult to decode the examples; we were trying to illustrate the breadth of the package while staying within the word limit.

To improve readability and ease of use, we have followed your suggestion and replaced one of those examples with a simplified version of the Arxiv behavior examples. We chose to retain the random model example because it illustrates another core functionality of the package, but we tried to simplify the explanation in the write-up.

I also included a new example Jupyter notebook that recreates the examples discussed in the text.

All updates can be found in the joss-reviews branch.

ajgates42 commented 5 years ago

@whedon generate pdf from branch joss-reviews

whedon commented 5 years ago
Attempting PDF compilation from custom branch joss-reviews. Reticulating splines etc...
whedon commented 5 years ago

:point_right: Check article proof :page_facing_up: :point_left:

ajgates42 commented 5 years ago

@whedon generate pdf from branch joss-reviews

whedon commented 5 years ago
Attempting PDF compilation from custom branch joss-reviews. Reticulating splines etc...
whedon commented 5 years ago

:point_right: Check article proof :page_facing_up: :point_left:

adavidzh commented 5 years ago

Quick question for @VivianePons: the review checklist is addressed to @pajaskowiak. Does this mean that only they should check items or can I also do it?

VivianePons commented 5 years ago

I used the wrong command when I added the second reviewer: I said "assign" instead of "add". You are both reviewers, but only @pajaskowiak is listed. I'm not sure how I should fix it.

pajaskowiak commented 5 years ago

Just read the manuscript. Excellent work, easy to follow and understand. My only minor point concerns the authors' statement that all measures are bounded to the [0,1] interval. This is not the case for the Adjusted Rand Index (ARI), which can be negative. Other than that, I have no concerns at all.
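
To make the point about the ARI concrete, here is a minimal sketch in plain Python. It uses the standard pair-counting formulation of the index (chance correction under the permutation model) and is not CluSim's own implementation; the function name `adjusted_rand_index` and the toy labelings are assumptions chosen for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two flat clusterings of the same elements."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same elements")

    # Contingency counts: elements that fall in cluster i of A and cluster j of B.
    cell_sizes = Counter(zip(labels_a, labels_b))
    sizes_a = Counter(labels_a)
    sizes_b = Counter(labels_b)

    # Numbers of element pairs co-clustered in both, in A, and in B.
    pairs_both = sum(comb(n, 2) for n in cell_sizes.values())
    pairs_a = sum(comb(n, 2) for n in sizes_a.values())
    pairs_b = sum(comb(n, 2) for n in sizes_b.values())
    pairs_total = comb(len(labels_a), 2)

    if pairs_total == 0:
        return 1.0  # fewer than two elements: trivially identical

    expected = pairs_a * pairs_b / pairs_total  # agreement expected by chance
    maximum = (pairs_a + pairs_b) / 2           # largest attainable agreement
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both clusterings are one big cluster
    return (pairs_both - expected) / (maximum - expected)


# Two clusterings of four elements that never agree on a co-clustered pair.
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
# The same partition with the cluster names swapped.
relabeled = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
print(crossed, relabeled)  # about -0.5 and 1.0
```

The first pair of labelings agrees less often than chance would predict, so the index drops below zero, which is why ARI cannot be described as bounded to [0,1].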

adavidzh commented 5 years ago

@VivianePons: perhaps @whedon add @adavidzh as reviewer :smile: @ajgates42: thanks for the changes. I would not mention reviewers in the acknowledgments, but that's in my field. Other than that, nothing to add.

VivianePons commented 5 years ago

@whedon add @adavidzh as reviewer

whedon commented 5 years ago

OK, @adavidzh is now a reviewer

VivianePons commented 5 years ago

The command added you to the list but didn't create the list of checkboxes, so I just did it manually. Please, just check that you can check them all ;)

Thanks and sorry for the confusion!

adavidzh commented 5 years ago

@VivianePons: all good, thanks. @ajgates42: while going through the review checklist:

  1. Can you point me to the contribution guidelines? IMO, this can be as simple as adding a section to README.md.
  2. I can't quite find how the 0.3.1 version is tagged: I see that version on PyPI but there is no branch or tag that I can find on GitHub.

ajgates42 commented 5 years ago

@adavidzh, you can now find contribution guidelines and a code of conduct in the README. I've also tagged a new release v0.3.2 and updated the version in __init__.py. The PyPI version will be updated shortly. Thanks!
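
A check like the one discussed here (release tag versus the version declared in the package) can be sketched as follows. This is an illustration only: it assumes the package declares its version as a `__version__ = "..."` line, which the thread does not confirm, and the helper names `read_declared_version` and `tag_matches_version` are hypothetical.

```python
import re


def read_declared_version(init_source):
    """Return the string assigned to __version__ in the source text, or None."""
    match = re.search(
        r"^__version__\s*=\s*['\"]([^'\"]+)['\"]", init_source, re.MULTILINE
    )
    return match.group(1) if match else None


def tag_matches_version(tag, init_source):
    """True if a release tag such as 'v0.3.2' equals the declared version."""
    declared = read_declared_version(init_source)
    bare_tag = tag[1:] if tag.startswith("v") else tag
    return declared is not None and bare_tag == declared


# Assumed contents of the package's __init__.py, for illustration only.
example_init = '"""Example package."""\n__version__ = "0.3.2"\n'

print(tag_matches_version("v0.3.2", example_init))  # True
print(tag_matches_version("v0.3.1", example_init))  # False
```

Running such a comparison before publishing keeps the repository tag, the package metadata, and the PyPI release in step.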

adavidzh commented 5 years ago

Thanks @ajgates42.

adavidzh commented 5 years ago

@whedon set v0.3.2 as version

whedon commented 5 years ago

I'm sorry @adavidzh, I'm afraid I can't do that. That's something only editors are allowed to do.

adavidzh commented 5 years ago

> I'm sorry @adavidzh, I'm afraid I can't do that. That's something only editors are allowed to do.

Hi @VivianePons, is it correct to say that the version number should change, given the additions to the repo?

VivianePons commented 5 years ago

@whedon set v0.3.2 as version

whedon commented 5 years ago

OK. v0.3.2 is the version.

VivianePons commented 5 years ago

Yes, it seems to make sense. I just updated it.

Do we have a green light from both reviewers now to accept the paper?

adavidzh commented 5 years ago

✅ from me.

pajaskowiak commented 5 years ago

Yes! Everything is fine.

VivianePons commented 5 years ago

Perfect!

@ajgates42 could you create an archive of your repo on Zenodo or figshare and generate a DOI? Thanks

ajgates42 commented 5 years ago

Great! Thank you all for the valuable feedback!

DOI: https://doi.org/10.5281/zenodo.2598682 Record: https://zenodo.org/record/2598682

On Tue, Mar 19, 2019 at 1:13 PM Viviane Pons notifications@github.com wrote:

Perfect!

@ajgates42 https://github.com/ajgates42 could you create an archive of your repo on Zenodo or figshare and generate a DOI? Thanks

-- Alexander Gates Postdoctoral Research Scholar Center for Complex Network Research Northeastern University http://alexandergates.net

VivianePons commented 5 years ago

@whedon generate pdf

whedon commented 5 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 5 years ago

:point_right: Check article proof :page_facing_up: :point_left:

VivianePons commented 5 years ago

@whedon check references

whedon commented 5 years ago
Attempting to check references...
whedon commented 5 years ago

OK DOIs

- 10.1103/PhysRevE.78.046110 is OK
- 10.1103/PhysRevE.80.016118 is OK
- 10.1137/080734315 is OK
- 10.3389/fnhum.2012.00145 is OK
- 10.1109/SBRN.2012.25 is OK
- 10.3389/fnins.2013.00133 is OK
- 10.1126/sciadv.1602548 is OK

MISSING DOIs

- https://doi.org/10.1111/j.1469-8137.1912.tb05611.x may be missing for title: The distribution of the flora in the alpine zone
- https://doi.org/10.4064/cm-6-1-319-327 may be missing for title: On a certain distance of sets and the corresponding distance of functions
- https://doi.org/10.1080/01621459.1971.10482356 may be missing for title: Objective Criteria for the Evaluation of Clustering Methods
- https://doi.org/10.1016/0022-5193(78)90137-6 may be missing for title: On the similarity of dendrograms
- https://doi.org/10.1207/s15327906mbr2302_6 may be missing for title: Omega: A general formulation of the rand index of cluster recovery suitable for non-disjoint solutions
- https://doi.org/10.1126/science.286.5439.531 may be missing for title: Molecular classification of cancer: class discovery and class prediction by gene expression monitoring
- https://doi.org/10.1016/s0920-9964(00)00042-6 may be missing for title: Differential activation of temporal cortex during sentence completion in schizophrenic patients with and without formal thought disorder
- https://doi.org/10.1111/1467-9868.00293 may be missing for title: Estimating the number of clusters in a data set via the gap statistic
- https://doi.org/10.1142/9789812799623_0002 may be missing for title: A stability based method for discovering structure in clustered data
- https://doi.org/10.1126/science.1073374 may be missing for title: Hierarchical organization of modularity in metabolic networks
- https://doi.org/10.1109/tkde.2003.1208999 may be missing for title: Topic-sensitive pagerank: A context-sensitive ranking algorithm for web search
- https://doi.org/10.1007/978-3-540-45167-9_14 may be missing for title: Comparing clusterings by the variation of information
- https://doi.org/10.1137/s0036144502415960 may be missing for title: A Measure of similarity between graph vertices: applications to synonym extraction and web searching
- https://doi.org/10.1073/pnas.0400054101 may be missing for title: Defining and identifying communities in networks
- https://doi.org/10.1088/1742-5468/2005/09/p09008 may be missing for title: Comparing community structure identification
- https://doi.org/10.1038/nature03607 may be missing for title: Uncovering the overlapping community structure of complex networks in nature and society
- https://doi.org/10.1038/nature03288 may be missing for title: Functional cartography of complex metabolic networks
- https://doi.org/10.1145/1117454.1117461 may be missing for title: Relevance search and anomaly detection in bipartite graphs
- https://doi.org/10.1109/tpami.2005.237 may be missing for title: Clustering ensembles: models of consensus and weak partitions
- https://doi.org/10.1007/s00357-006-0017-z may be missing for title: On similarity indices and correction for chance agreement
- https://doi.org/10.1109/icdm.2006.70 may be missing for title: Fast random walk with restart and its applications
- https://doi.org/10.1109/cvpr.2006.289 may be missing for title: Spectral methods for automatic multiscale data clustering
- https://doi.org/10.1016/j.jmva.2006.11.013 may be missing for title: Comparing clusterings—an information based distance
- https://doi.org/10.1007/978-3-540-73133-7_1 may be missing for title: Structural Inference of Hierarchies in Networks
- https://doi.org/10.1103/physreve.76.046115 may be missing for title: Bipartite network projection and personal recommendation
- https://doi.org/10.1001/archpsyc.64.2.138 may be missing for title: Increased temporal and prefrontal activity in response to semantic associations in schizophrenia
- https://doi.org/10.1016/j.patrec.2006.11.010 may be missing for title: A fuzzy extension of the Rand index and other related indexes for clustering and classification assessment
- https://doi.org/10.1088/1742-5468/2008/10/p10008 may be missing for title: Fast unfolding of communities in large networks
- https://doi.org/10.1137/1.9781611972887.11 may be missing for title: Consensus clustering algorithms: comparison and refinement
- https://doi.org/10.1007/978-3-540-88411-8_22 may be missing for title: Refining pairwise similarity matrix for cluster ensemble problem with cluster relations
- https://doi.org/10.1137/1.9781611972788.54 may be missing for title: On the dangers of cross-validation. An experimental evaluation
- https://doi.org/10.1103/physreve.77.046119 may be missing for title: Robustness of community structure in networks
- https://doi.org/10.1007/s10791-008-9066-8 may be missing for title: A comparison of extrinsic clustering evaluation metrics based on formal constraints
- https://doi.org/10.1103/physreve.80.056117 may be missing for title: Community detection algorithms: a comparative analysis
- https://doi.org/10.1007/s10115-008-0150-6 may be missing for title: Characterization and evaluation of similarity measures for pairs of clusterings
- https://doi.org/10.1080/15427951.2009.10129177 may be missing for title: Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters
- https://doi.org/10.1073/pnas.0808904106 may be missing for title: Extracting the multiscale backbone of complex weighted networks
- https://doi.org/10.1073/pnas.0903215107 may be missing for title: Stability of graph communities across time scales
- https://doi.org/10.1038/nature09182 may be missing for title: Link communities reveal multiscale complexity in networks
- https://doi.org/10.1016/j.physrep.2009.11.002 may be missing for title: Community detection in graphs
- https://doi.org/10.1016/j.patrec.2009.09.011 may be missing for title: Data clustering: 50 years beyond K-means
- https://doi.org/10.1126/science.1184819 may be missing for title: Community structure in time-dependent, multiscale, and multiplex networks
- https://doi.org/10.1016/j.patrec.2010.01.002 may be missing for title: Generalized external indexes for comparing data partitions with overlapping categories
- https://doi.org/10.1103/physreve.84.016111 may be missing for title: Assessing the consistency of community structure in complex networks
- https://doi.org/10.1007/s11634-011-0090-y may be missing for title: Correcting Jaccard and other similarity indices for chance agreement in cluster analysis
- https://doi.org/10.1109/icdm.2012.139 may be missing for title: Community-Affiliation Graph Model for Overlapping Network Community Detection
- https://doi.org/10.1038/srep00336 may be missing for title: Consensus clustering in complex networks
- https://doi.org/10.1016/j.neuroimage.2011.11.035 may be missing for title: The discovery of population differences in network community structure: new methods and applications to brain functional networks in schizophrenia
- https://doi.org/10.1038/nphys2162 may be missing for title: Communities, modules and large-scale structure in networks
- https://doi.org/10.1007/978-3-642-24466-7_4 may be missing for title: An overall index for comparing hierarchical clusterings
- https://doi.org/10.1037/e502412013-055 may be missing for title: Bayesian estimation supersedes the t test
- https://doi.org/10.1073/pnas.1221839110 may be missing for title: Efficient discovery of overlapping communities in massive networks
- https://doi.org/10.1016/j.neuroimage.2013.05.081 may be missing for title: Groupwise whole-brain parcellation from resting-state fMRI data for network node identification
- https://doi.org/10.1103/physreve.90.062805 may be missing for title: Community detection in networks: Structural communities versus ground truth
- https://doi.org/10.1093/schbul/sbu059 may be missing for title: Disrupted modular architecture of cerebellum in schizophrenia: a graph theoretic analysis
- https://doi.org/10.1145/2594454 may be missing for title: Structure and overlaps of ground-truth communities in networks
- https://doi.org/10.1073/pnas.1409770111 may be missing for title: Scalable detection of statistically significant communities and hierarchies, using message passing for modularity
- https://doi.org/10.1093/bioinformatics/btu462 may be missing for title: ASTRAL: genome-scale coalescent-based species tree estimation
- https://doi.org/10.1016/j.schres.2015.08.011 may be missing for title: Nodal centrality of functional network in the differentiation of schizophrenia
- https://doi.org/10.1088/1742-5468/2015/11/p11006 may be missing for title: Evaluating accuracy of community detection using the relative normalized mutual information
- https://doi.org/10.1103/physreve.92.062825 may be missing for title: Hierarchical mutual information for the comparison of hierarchical community structures in complex networks
- https://doi.org/10.1111/coin.12100 may be missing for title: Correction for Closeness: Adjusting Normalized Mutual Information Measure for Clustering Comparison
- https://doi.org/10.1145/2808797.2809344 may be missing for title: Is normalized mutual information a fair measure for comparing community detection methods?
- https://doi.org/10.1038/srep17994 may be missing for title: Random walk hierarchy measure: what is more hierarchical, a chain, a tree or a star?
- https://doi.org/10.1103/physreve.92.060801 may be missing for title: Modularity and the spread of perturbations in complex dynamical systems
- https://doi.org/10.1101/196840 may be missing for title: The Impact of Random Models on Clustering Similarity
- https://doi.org/10.1093/bioinformatics/btq228 may be missing for title: DendroPy: A Python library for phylogenetic computing

INVALID DOIs

- None
VivianePons commented 5 years ago

@ajgates42 it seems like some references are missing their DOIs. For example, "The impact of random models on clustering similarity" has the DOI https://doi.org/10.1101/196840, which is not in the bib file. Can you check? whedon has posted the list of all references without DOIs.

Thanks
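
For anyone who wants to run a similar check locally before asking whedon, here is a rough sketch that lists BibTeX entries with no `doi` field. It is not how whedon works (whedon also looks up candidate DOIs, which this does not attempt), the regular expressions only handle simply formatted files, and the citation keys in the sample are hypothetical.

```python
import re

# Start of an entry, e.g. "@article{some_key,"; captures the type and the key.
ENTRY_START = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
# A doi field anywhere in the entry body (naive: a title containing "doi =" would also match).
DOI_FIELD = re.compile(r"\bdoi\s*=", re.IGNORECASE)
NON_REFERENCE_TYPES = {"comment", "string", "preamble"}


def entries_missing_doi(bibtex_text):
    """Return the citation keys of entries that have no doi field."""
    starts = list(ENTRY_START.finditer(bibtex_text))
    missing = []
    for index, start in enumerate(starts):
        if start.group(1).lower() in NON_REFERENCE_TYPES:
            continue
        # The entry body runs until the next entry starts (or the file ends).
        if index + 1 < len(starts):
            end = starts[index + 1].start()
        else:
            end = len(bibtex_text)
        if not DOI_FIELD.search(bibtex_text[start.end():end]):
            missing.append(start.group(2))
    return missing


# Sample with hypothetical keys; titles and the DOI are taken from this thread.
SAMPLE_BIB = """
@article{random_models_paper,
  title = {The Impact of Random Models on Clustering Similarity},
  doi = {10.1101/196840}
}
@article{data_clustering_review,
  title = {Data clustering: a review}
}
"""

print(entries_missing_doi(SAMPLE_BIB))  # ['data_clustering_review']
```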

arfon commented 5 years ago

@ajgates42 - you might also want to remove entries from your bibtex that you're not actually using in this paper?

ajgates42 commented 5 years ago

@VivianePons: JMLR doesn't actually give a DOI yet, but I guess we can use the bioRxiv version instead.

ajgates42 commented 5 years ago

@whedon check references

VivianePons commented 5 years ago

@whedon check references

whedon commented 5 years ago
Attempting to check references...
whedon commented 5 years ago

OK DOIs

- 10.1007/BF01908075 is OK
- 10.1186/1471-2105-9-497 is OK
- 10.1145/1553374.1553511 is OK
- 10.1101/196840 is OK
- 10.1093/bioinformatics/btq228 is OK

MISSING DOIs

- None

INVALID DOIs

- None
VivianePons commented 5 years ago

@whedon generate pdf

whedon commented 5 years ago
Attempting PDF compilation. Reticulating splines etc...
whedon commented 5 years ago

:point_right: Check article proof :page_facing_up: :point_left:

VivianePons commented 5 years ago

We're still missing a DOI for "Data clustering: a review" (not listed by whedon for some reason...). The other missing ones are "misc" entries or arXiv preprints, so I guess this is OK.

Also the date on the scipy citation prints weirdly, @arfon do you know if it's the correct way to do it?

arfon commented 5 years ago

> Also the date on the scipy citation prints weirdly, @arfon do you know if it's the correct way to do it?

Not sure, I think it's probably because of the bibtex entry not actually having a date range https://github.com/Hoosier-Clusters/clusim/blob/master/paper.bib#L85