openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal

[PRE REVIEW]: arulespy: Exploring Association Rules and Frequent Itemsets in Python #5493

Closed editorialbot closed 1 year ago

editorialbot commented 1 year ago

Submitting author: @mhahsler (Michael Hahsler)
Repository: https://github.com/mhahsler/arulespy
Branch with paper.md (empty if default branch):
Version: 0.1.1
Editor: Pending
Reviewers: Pending
Managing EiC: Daniel S. Katz

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/6e69c850478b22678f9457cf105f5f68"><img src="https://joss.theoj.org/papers/6e69c850478b22678f9457cf105f5f68/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/6e69c850478b22678f9457cf105f5f68/status.svg)](https://joss.theoj.org/papers/6e69c850478b22678f9457cf105f5f68)

Author instructions

Thanks for submitting your paper to JOSS @mhahsler. Currently, there isn't a JOSS editor assigned to your paper.

@mhahsler if you have any suggestions for potential reviewers then please mention them here in this thread (without tagging them with an @). You can search the list of people that have already agreed to review and may be suitable for this submission.

Editor instructions

The JOSS submission bot @editorialbot is here to help you find and assign reviewers and start the main review. To find out what @editorialbot can do for you type:

@editorialbot commands
editorialbot commented 1 year ago

Hello human, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf
editorialbot commented 1 year ago
Software report:

github.com/AlDanial/cloc v 1.88  T=0.15 s (136.1 files/s, 340626.6 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
HTML                             5           4949              4          41506
TeX                              1             76              0            688
Markdown                         2            139              0            501
Python                           5            121            105            258
Jupyter Notebook                 2              0           3939            147
YAML                             3              6              5             70
INI                              1              1              0             12
JSON                             1              0              0             10
TOML                             1              0              0              3
-------------------------------------------------------------------------------
SUM:                            21           5292           4053          43195
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository
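
The size question raised later in this thread can be checked directly against the cloc table. The following is a minimal sketch, assuming the rows are copied by hand from the report above; the names `CLOC_ROWS`, `totals`, and `code_share` are hypothetical helpers introduced for illustration and are not part of cloc or editorialbot. It recomputes the SUM row and the fraction of counted code lines that is Python.

```python
# Rows copied from the cloc report above: (language, files, blank, comment, code).
CLOC_ROWS = [
    ("HTML", 5, 4949, 4, 41506),
    ("TeX", 1, 76, 0, 688),
    ("Markdown", 2, 139, 0, 501),
    ("Python", 5, 121, 105, 258),
    ("Jupyter Notebook", 2, 0, 3939, 147),
    ("YAML", 3, 6, 5, 70),
    ("INI", 1, 1, 0, 12),
    ("JSON", 1, 0, 0, 10),
    ("TOML", 1, 0, 0, 3),
]


def totals(rows):
    """Sum the files, blank, comment, and code columns over all rows."""
    return tuple(sum(row[i] for row in rows) for i in range(1, 5))


def code_share(rows, language):
    """Return the fraction of all counted code lines in the given language."""
    total_code = sum(row[4] for row in rows)
    language_code = sum(row[4] for row in rows if row[0] == language)
    return language_code / total_code


if __name__ == "__main__":
    # Compare the first line printed with the SUM row of the report.
    print(totals(CLOC_ROWS))
    print(f"Python share of counted code lines: {code_share(CLOC_ROWS, 'Python'):.2%}")
```

Most of the counted lines are HTML, so the Python row is the figure that matters for the scope discussion.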
editorialbot commented 1 year ago

Wordcount for paper.md is 2751

editorialbot commented 1 year ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.21105/joss.00638 is OK
- 10.32614/RJ-2017-047 is OK
- 10.1016/s0306-4379(03)00072-3 is OK
- 10.1145/1132960.1132963 is OK
- 10.1007/978-3-540-44918-8_3 is OK

MISSING DOIs

- 10.1007/978-3-642-57489-4_59 may be a valid DOI for title: Induction of Association Rules: Apriori Implementation
- 10.1145/235968.233311 may be a valid DOI for title: Mining Quantitative Association Rules in Large Relational Tables
- 10.1145/170036.170072 may be a valid DOI for title: Mining Association Rules between Sets of Items in Large Databases
- 10.1109/tkde.2003.1161582 may be a valid DOI for title: Alternative Interest Measures for Mining Associations in Databases
- 10.1145/253260.253325 may be a valid DOI for title: Dynamic Itemset Counting and Implication Rules for Market Basket Data
- 10.1145/360402.360421 may be a valid DOI for title: Algorithms for Association Rule Mining – A General Survey and Comparison
- 10.1023/b:dami.0000040429.96086.c7 may be a valid DOI for title: Mining Non-Redundant Association Rules
- 10.1109/69.846291 may be a valid DOI for title: Scalable Algorithms for Association Mining
- 10.1145/502512.502526 may be a valid DOI for title: Empirical Bayes Screening for Multi-Item Associations
- 10.1007/3-540-31314-1_73 may be a valid DOI for title: Implications of Probabilistic Data Modeling for Mining Association Rules
- 10.1109/icdm.2002.1183923 may be a valid DOI for title: Efficient Progressive Sampling for Association Rules
- 10.1287/ijoc.15.2.208.14448 may be a valid DOI for title: Relationship-Based Clustering and Visualization for High-Dimensional Data Mining
- 10.1109/icdm.2003.1250944 may be a valid DOI for title: Mining Strong Affinity Association Patterns in Data Sets with Skewed Support Distribution
- 10.1145/312129.312216 may be a valid DOI for title: Pruning and summarizing the discovered associations
- 10.1016/s0306-4379(03)00072-3 may be a valid DOI for title: Selecting the right objective measure for association analysis
- 10.1109/icde.1999.754924 may be a valid DOI for title: Constraint-based rule mining in large, dense databases
- 10.1109/69.979972 may be a valid DOI for title: Finding Localized Associations in Market Basket Data
- 10.1007/s00180-007-0062-z may be a valid DOI for title: Selective Association Rule Generation
- 10.3233/ida-2007-11502 may be a valid DOI for title: New Probabilistic Interest Measures for Association Rules
- 10.1007/978-3-540-70981-7_51 may be a valid DOI for title: Building on the arules Infrastructure for Analyzing Transaction Data with R
- 10.1609/icwsm.v3i1.13937 may be a valid DOI for title: Gephi: An Open Source Software for Exploring and Manipulating Networks
- 10.1007/s10618-005-0026-2 may be a valid DOI for title: A Model-Based Frequency Constraint for Mining Associations from Transaction Data
- 10.1145/1774088.1774306 may be a valid DOI for title: A study on interestingness measures for associative classifiers
- 10.1007/s11573-016-0822-8 may be a valid DOI for title: Visualizing Association Rules in Hierarchical Groups
- 10.1201/9780429447273 may be a valid DOI for title: Interactive Web-Based Data Visualization with R, plotly, and shiny

INVALID DOIs

- None
editorialbot commented 1 year ago

Download article proof | View article proof on GitHub

danielskatz commented 1 year ago

👋 @mhahsler - Thanks for your submission.

I have a couple of questions and a comment:

Questions: I only see about 250 lines of code in the repository. Is this correct, or am I missing something?

This and the description of the repo seem to imply that this package is a wrapper around an R library. Is this correct?

Comment: The paper is >2700 words, compared with the JOSS guidelines of about 1000 words.

In addition, you could work on the possibly missing DOIs that editorialbot suggests, but note that some may be incorrect. Please feel free to make changes to your .bib file, then use the command @editorialbot check references to check again, and the command @editorialbot generate pdf when the references are right to make a new PDF. editorialbot commands need to be the first entry in a new comment.

mhahsler commented 1 year ago

Hi @danielskatz,

Yes, the Python package wraps several R libraries (arules (https://github.com/mhahsler/arules) and arulesViz (https://github.com/mhahsler/arulesViz)), which we have been developing for a while now. We did not want to rewrite the large amount of C/C++ code developed over the last 10 years. The Python portion of the interface code is indeed relatively short. I have read about the limit of at least 300 lines. However, the code is intricate, and it took me quite a while to get it right and reduce it to the minimum needed. I learned the hard way that more code is not always better, especially when it comes to long-term maintenance.

The examples in the paper can be easily shortened to the required length, and the DOIs added.

Please let me know and I will shorten the current paper.

danielskatz commented 1 year ago

Thanks - I'll start a scope review on the size/effort of the code, and if that passes, then we can work on the paper issues before we start the actual review. The editors will look at the code, and should make a decision in a week or two.

danielskatz commented 1 year ago

@editorialbot query scope

editorialbot commented 1 year ago

Submission flagged for editorial review.

mhahsler commented 1 year ago

Thanks!

danielskatz commented 1 year ago

👋 @mhahsler - I'm sorry to say that after discussion amongst the JOSS editors, we have decided that this submission does not meet the substantial scholarly effort criterion for review by JOSS, primarily because it is mostly a wrapper around existing code. Please see https://joss.readthedocs.io/en/latest/submitting.html#other-venues-for-reviewing-and-publishing-software-packages for other suggestions for how you might receive credit for your work.

danielskatz commented 1 year ago

@editorialbot reject

editorialbot commented 1 year ago

Paper rejected.

mhahsler commented 1 year ago

Thanks for letting me know so quickly.

Best, Michael