PyXplor - Group 2 - Githubissues

Submitting Author: Name (@rbouwer)
All current maintainers: (@arturoboquin, @iris0614, @phchen5)
Package Name: PyXplor
One-Line Description of Package: A package for simplifying the EDA of different data types!
Repository Link: https://github.com/UBC-MDS/PyXplor
Version submitted: 2.0.0
Editor: @ttimbers
Reviewer 1: @Marcony1 - Marco Polo Bravo Montiel
Reviewer 2: @salva-u - Salva Umar
Reviewer 3: @joeywwwu - Joey Wu
Reviewer 4: @YimengXia - Yimeng Xia

Archive: TBD
JOSS DOI: TBD
Version accepted: TBD
Date accepted (month/day/year): TBD

Code of Conduct & Commitment to Maintain Package

[x] I agree to abide by pyOpenSci's Code of Conduct during the review process and in maintaining my package after should it be accepted.
[x] I have read and will commit to package maintenance after the review as per the pyOpenSci Policies Guidelines.

Description

pyxplor is a comprehensive Python package designed to automate and streamline the Exploratory Data Analysis (EDA) process. Tailored for various data types including numeric, categorical, binary, and time series data, pyxplor aims to enhance data interpretation through a suite of specialized plotting functions. This package seeks to reduce the complexity and time invested in initial data analysis, making it an essential tool for data scientists and analysts at all levels.

Scope

Please indicate which category or categories. Check out our package scope page to learn more about our scope. (If you are unsure of which category you fit, we suggest you make a pre-submission inquiry):
- [ ] Data retrieval
- [ ] Data extraction
- [ ] Data processing/munging
- [ ] Data deposition
- [ ] Data validation and testing
- [x] Data visualization[^1]
- [ ] Workflow automation
- [ ] Citation management and bibliometrics
- [ ] Scientific software wrappers
- [ ] Database interoperability

Domain Specific & Community Partnerships

[ ] Geospatial
[ ] Education
[ ] Pangeo

Community Partnerships

If your package is associated with an existing community please check below:

[ ] Pangeo
- [ ] My package adheres to the Pangeo standards listed in the pyOpenSci peer review guidebook

[^1]: Please fill out a pre-submission inquiry before submitting a data visualization package.

For all submissions, explain how and why the package falls under the categories you indicated above. In your explanation, please address the following points (briefly, 1-2 sentences for each):
- Who is the target audience and what are scientific applications of this package?
The target audience for this package is all data scientists starting their data analysis with EDA. This package enhances data visualization for better interpretation and reduces the complexity and time invested in initial data analysis.
- Are there other Python packages that accomplish the same thing? If so, how does yours differ?

While there are several EDA packages in the Python ecosystem, such as pandas-profiling (link) and sweetviz (link), pyxplor differentiates itself by offering specialized functions for different data types. This targeted approach enables more nuanced and relevant insights, particularly for binary and time-series data which are often less catered for in existing tools. pyxplor complements these existing tools by filling these specific gaps, thus enriching the Python EDA toolkit.

If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted:

Technical checks

For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:

[x] does not violate the Terms of Service of any service it interacts with.
[x] uses an OSI approved license.
[x] contains a README with instructions for installing the development version.
[x] includes documentation with examples for all functions.
[x] contains a tutorial with examples of its essential functions and uses.
[x] has a test suite.
[x] has continuous integration setup, such as GitHub Actions, CircleCI, and/or others.

Publication Options

[ ] Do you wish to automatically submit to the Journal of Open Source Software? If so:

JOSS Checks

- [ ] The package has an **obvious research application** according to JOSS's definition in their [submission requirements][JossSubmissionRequirements]. Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
- [ ] The package is not a "minor utility" as defined by JOSS's [submission requirements][JossSubmissionRequirements]: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
- [ ] The package contains a `paper.md` matching [JOSS's requirements][JossPaperRequirements] with a high-level description in the package root or in `inst/`.
- [ ] The package is deposited in a long-term repository with the DOI:

*Note: JOSS accepts our review as theirs. You will NOT need to go through another full review. JOSS will only review your paper.md file. Be sure to link to this pyOpenSci issue when a JOSS issue is opened for your package. Also be sure to tell the JOSS editor that this is a pyOpenSci reviewed package once you reach this step.*

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.

[x] Yes I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.

Confirm each of the following by checking the box.

[x] I have read the author guide.
[ ] I expect to maintain this package for at least 2 years and can help find a replacement for the maintainer (team) if needed.

Please fill out our survey

[x] Last but not least please fill out our pre-review survey. This helps us track submission and improve our peer review process. We will also ask our reviewers and editors to fill this out.

P.S. Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

The editor template can be found here.

The review template can be found here.

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

[X] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

[X] A statement of need clearly stating problems the software is designed to solve and its target audience in README.
[X] Installation instructions: for the development version of the package and any non-standard dependencies in README.
[X] Vignette(s) demonstrating major functionality that runs successfully locally.
[X] Function Documentation: for all user-facing functions.
[X] Examples for all user-facing functions.
[X] Community guidelines including contribution guidelines in the README or CONTRIBUTING.
[X] Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a pyproject.toml file or elsewhere.

Readme file requirements The package meets the readme requirements below:

[X] Package has a README.md file in the root directory.

The README should include, from top to bottom:

[X] The package name
[ ] Badges for:
- [ ] Continuous integration and test coverage,
- [X] Docs building (if you have a documentation website),
- [X] A repostatus.org badge,
- [ ] Python versions supported,
- [X] Current package version (on PyPI / Conda).

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be wider than it is high. (Note that a badge for pyOpenSci peer-review will be provided upon acceptance.)

[X] Short description of package goals.
[X] Package installation instructions
[X] Any additional setup required to use the package (authentication tokens, etc.)
[X] Descriptive links to all vignettes. If the package is small, there may only be a need for one vignette which could be placed in the README.md file.
- [X] Brief demonstration of package usage (as it makes sense - links to vignettes could also suffice here if package description is clear)
[X] Link to your documentation website.
[X] If applicable, how the package compares to other similar packages and/or how it relates to other packages in the scientific ecosystem.
[X] Citation information

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:

[X] Package documentation is clear and easy to find and use.
[X] The need for the package is clear
[X] All functions have documentation and associated examples for use
[X] The package is easy to install

Functionality

[X] Installation: Installation succeeds as documented.
[X] Functionality: Any functional claims of the software have been confirmed.
[X] Performance: Any performance claims of the software have been confirmed.
[ ] Automated tests:
- [ ] All tests pass on the reviewer's local machine for the package version submitted by the author. Ideally this should be a tagged version making it easy for reviewers to install.
- [X] Tests cover essential functions of the package and a reasonable range of inputs and conditions.
[X] Continuous Integration: Has continuous integration setup (We suggest using Github actions but any CI platform is acceptable for review)
[X] Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines. A few notable highlights to look at:
- [X] Package supports modern versions of Python and not End of life versions.
- [X] Code format is standard throughout package and follows PEP 8 guidelines (CI tests for linting pass)

For packages also submitting to JOSS

[X] ~The package has an obvious research application according to JOSS's definition in their submission requirements.~

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

[X] A short summary describing the high-level functionality of the software
[X] References: With DOIs for all those that have one (e.g. papers, datasets, software).

Final approval (post-review)

[X] The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing:

Review Comments:

Overall the package is super useful and very well documented. It is easy to follow along, and I really enjoyed going through the detailed vignette. The package has nice visualisations and the README is well written. There are some points below for your consideration:

1. The tests below did not pass; a relative path issue may be worth considering as an improvement:


===================================================================================== short test summary info ===================================================================================== 
FAILED tests/test_plot_binary.py::test_figure - _tkinter.TclError: Can't find a usable init.tcl in the following directories:
FAILED tests/test_plot_time_series.py::test_output_path_generation - _tkinter.TclError: Can't find a usable tk.tcl in the following directories:
================================================================================== 2 failed, 59 passed in 31.84s ==================================================================================


2. The pyxplor.py file can be deleted since it does not contain anything.
3. Consider breaking the main vignette into smaller ones: a quick-look, basic-usage section, and a longer, in-depth vignette covering the lengthier explorations that are possible with the package, placed under a separate tab. That said, the current breakdown is definitely helpful.
4. The categorical bar plots could be ordered from largest to smallest to make the categories easier to compare. They are ordered in the faceted plots; only the first categorical plot in the docs is not.
5. Not all of the badges are there; consider adding the two missing ones: `Continuous integration and test coverage` and `Python versions supported`.
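
The two `_tkinter.TclError` failures reported above may come from matplotlib picking a Tk-based interactive backend on a machine without a working Tcl/Tk installation, rather than from a path problem. A minimal sketch of one possible workaround follows, assuming the tests only need to build figures and never display them. The `conftest.py` file and the `force_headless_backend` helper are hypothetical and not part of PyXplor as reviewed.

```python
# conftest.py -- hypothetical file at the repository root (not part of PyXplor as reviewed)
import os


def force_headless_backend(env=None):
    """Pick matplotlib's non-interactive Agg backend unless one is already configured."""
    if env is None:
        env = os.environ
    # matplotlib consults the MPLBACKEND environment variable when choosing its
    # default backend, so this needs to run before matplotlib is first imported.
    env.setdefault("MPLBACKEND", "Agg")
    return env["MPLBACKEND"]


# pytest imports conftest.py before it collects the test modules in the same directory tree.
force_headless_backend()
```

Because the helper uses `setdefault`, a developer who exports `MPLBACKEND` explicitly keeps their own choice.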

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

[X] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

[X] A statement of need clearly stating problems the software is designed to solve and its target audience in README.
[X] Installation instructions: for the development version of the package and any non-standard dependencies in README.
[X] Vignette(s) demonstrating major functionality that runs successfully locally.
[X] Function Documentation: for all user-facing functions.
[X] Examples for all user-facing functions.
[X] Community guidelines including contribution guidelines in the README or CONTRIBUTING.
[X] Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a pyproject.toml file or elsewhere.

Readme file requirements The package meets the readme requirements below:

[X] Package has a README.md file in the root directory.

The README should include, from top to bottom:

[X] The package name
[ ] Badges for:
- [ ] Continuous integration and test coverage,
- [X] Docs building (if you have a documentation website),
- [X] A repostatus.org badge,
- [ ] Python versions supported,
- [X] Current package version (on PyPI / Conda).

[X] Short description of package goals.
[X] Package installation instructions
[X] Any additional setup required to use the package (authentication tokens, etc.)
[X] Descriptive links to all vignettes. If the package is small, there may only be a need for one vignette which could be placed in the README.md file.
- [X] Brief demonstration of package usage (as it makes sense - links to vignettes could also suffice here if package description is clear)
[X] Link to your documentation website.
[X] If applicable, how the package compares to other similar packages and/or how it relates to other packages in the scientific ecosystem.
[X] Citation information

Usability

[X] Package documentation is clear and easy to find and use.
[X] The need for the package is clear
[X] All functions have documentation and associated examples for use
[X] The package is easy to install

Functionality

[X] Installation: Installation succeeds as documented.
[X] Functionality: Any functional claims of the software have been confirmed.
[X] Performance: Any performance claims of the software have been confirmed.
[ ] Automated tests:
- [ ] All tests pass on the reviewer's local machine for the package version submitted by the author. Ideally this should be a tagged version making it easy for reviewers to install.
- [X] Tests cover essential functions of the package and a reasonable range of inputs and conditions.
[X] Continuous Integration: Has continuous integration setup (We suggest using Github actions but any CI platform is acceptable for review)
[X] Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines. A few notable highlights to look at:
- [X] Package supports modern versions of Python and not End of life versions.
- [X] Code format is standard throughout package and follows PEP 8 guidelines (CI tests for linting pass)

For packages also submitting to JOSS

[ ] The package has an obvious research application according to JOSS's definition in their submission requirements.

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

[X] A short summary describing the high-level functionality of the software
[X] Authors: A list of authors with their affiliations
[X] A statement of need clearly stating problems the software is designed to solve and its target audience.
[X] References: With DOIs for all those that have one (e.g. papers, datasets, software).

Final approval (post-review)

[X] The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 1h

Review Comments

The package is well organized and very convenient. I really enjoyed using it to automate and streamline the exploratory data analysis process. The documentation explains the use of this package well and provides excellent examples for each core function. Some aspects to consider for improvement are:

Consider deleting pyxplor.py, as it contains nothing.
It would help to give more context about the dataset. A more detailed introduction explaining the importance and applications of EDA in data science would also help users understand the relevance of the examples.
If possible, include interactive elements or widgets in the tutorial for a hands-on experience. These can make the learning process more engaging and effective, and help users better understand the capabilities and use of the package.
See the badges example; it would be better to include more relevant badges.

All tests passed on my end, good job!

$ pytest tests/
========================================= test session starts ==========================================
platform darwin -- Python 3.11.7, pytest-7.4.4, pluggy-1.3.0
rootdir: /Users/wenweiwu/Desktop/UBC_MDS/block4/DSCI_524/PyXplor
plugins: cov-4.1.0, anyio-4.2.0
collected 61 items                                                                                     

tests/test_plot_binary.py ..................                                                     [ 29%]
tests/test_plot_categorical.py .............                                                     [ 50%]
tests/test_plot_numeric.py ..............                                                        [ 73%]
tests/test_plot_time_series.py ................                                                  [100%]

========================================= 61 passed in 13.66s ==========================================

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

[x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

[x] A statement of need clearly stating problems the software is designed to solve and its target audience in README.
[x] Installation instructions: for the development version of the package and any non-standard dependencies in README.
[x] Vignette(s) demonstrating major functionality that runs successfully locally.
[x] Function Documentation: for all user-facing functions.
[x] Examples for all user-facing functions.
[x] Community guidelines including contribution guidelines in the README or CONTRIBUTING.
[x] Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a pyproject.toml file or elsewhere.

Readme file requirements The package meets the readme requirements below:

[x] Package has a README.md file in the root directory.

The README should include, from top to bottom:

[x] The package name
[ ] Badges for:
- [ ] Continuous integration and test coverage,
- [x] Docs building (if you have a documentation website),
- [x] A repostatus.org badge,
- [ ] Python versions supported,
- [x] Current package version (on PyPI / Conda).

[x] Short description of package goals.
[x] Package installation instructions
[x] Any additional setup required to use the package (authentication tokens, etc.)
[x] Descriptive links to all vignettes. If the package is small, there may only be a need for one vignette which could be placed in the README.md file.
- [x] Brief demonstration of package usage (as it makes sense - links to vignettes could also suffice here if package description is clear)
[x] Link to your documentation website.
[x] If applicable, how the package compares to other similar packages and/or how it relates to other packages in the scientific ecosystem.
[x] Citation information

Usability

[x] Package documentation is clear and easy to find and use.
[x] The need for the package is clear
[x] All functions have documentation and associated examples for use
[x] The package is easy to install

Functionality

[x] Installation: Installation succeeds as documented.
[x] Functionality: Any functional claims of the software have been confirmed.
[x] Performance: Any performance claims of the software have been confirmed.
[ ] Automated tests:
- [ ] All tests pass on the reviewer's local machine for the package version submitted by the author. Ideally this should be a tagged version making it easy for reviewers to install.
- [x] Tests cover essential functions of the package and a reasonable range of inputs and conditions.
[x] Continuous Integration: Has continuous integration setup (We suggest using Github actions but any CI platform is acceptable for review)
[x] Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines. A few notable highlights to look at:
- [x] Package supports modern versions of Python and not End of life versions.
- [x] Code format is standard throughout package and follows PEP 8 guidelines (CI tests for linting pass)

For packages also submitting to JOSS

[ ] The package has an obvious research application according to JOSS's definition in their submission requirements.

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

[x] A short summary describing the high-level functionality of the software
[x] Authors: A list of authors with their affiliations
[x] A statement of need clearly stating problems the software is designed to solve and its target audience.
[x] References: With DOIs for all those that have one (e.g. papers, datasets, software).

Final approval (post-review)

[x] The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 1h

Review Comments

Overall, I think this package is really helpful in reducing the workload in EDA. The example usage is very comprehensive and clear, and I enjoyed reading through it. Good job!

Below is my feedback:

Add the missing badges (Continuous integration and test coverage, and Python versions supported).
You can delete the 'pyxplor.py' file as it is empty.
It might be beneficial to reconsider the use of colour in the Distribution of Categorical Variables. The current approach assigns colours to bars based on their count ranking, which could potentially confuse users. For example, in your example.ipynb, pickup_borough is displayed in green for Bronx, while dropoff_borough is in red, solely due to count variations for the same variable. Given that each bar already has a clear label, the additional colour coding might not be necessary and could be removed.
You can enhance the EDA experience by offering scatterplots between the target variable (if numerical) and the numerical explanatory variables, catering to users who want to visualize the relationships before creating models for predictions.
Consider EDA for ordinal variables. I think "passengers" is visualized as an ordinal variable rather than a categorical variable, as you've maintained the natural order of the number of passengers instead of ranking the values by count as you did with the other categorical variables (see example.ipynb).
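
If the authors would rather keep colour than drop it, one alternative to the colour point above is to tie each colour to the category label instead of to its count rank, so that a label such as Bronx looks the same in every panel. The sketch below uses only the standard library and made-up toy data. `PALETTE`, `colour_by_label` and `bars_by_count` are hypothetical names, not part of PyXplor's API.

```python
from collections import Counter

# Hypothetical palette; any list of colour names or hex codes would do.
PALETTE = ["tab:blue", "tab:orange", "tab:green", "tab:red", "tab:purple"]


def colour_by_label(*columns):
    """Map each category label to one fixed colour, shared across all columns."""
    labels = sorted({value for column in columns for value in column})
    return {label: PALETTE[i % len(PALETTE)] for i, label in enumerate(labels)}


def bars_by_count(column):
    """Return (label, count) pairs ordered from most to least frequent."""
    return Counter(column).most_common()


# Toy data for illustration only; not taken from example.ipynb.
pickup = ["Manhattan", "Manhattan", "Bronx", "Queens", "Manhattan", "Queens"]
dropoff = ["Queens", "Bronx", "Bronx", "Bronx", "Manhattan", "Queens"]

colours = colour_by_label(pickup, dropoff)
for name, column in (("pickup", pickup), ("dropoff", dropoff)):
    for label, count in bars_by_count(column):
        print(name, label, count, colours[label])
```

The bars are still ordered by count within each panel, but the colour lookup no longer depends on that order.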

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

[x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

[x] A statement of need clearly stating problems the software is designed to solve and its target audience in README.
[x] Installation instructions: for the development version of the package and any non-standard dependencies in README.
[x] Vignette(s) demonstrating major functionality that runs successfully locally.
[x] Function Documentation: for all user-facing functions.
[x] Examples for all user-facing functions.
[x] Community guidelines including contribution guidelines in the README or CONTRIBUTING.
[x] Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a pyproject.toml file or elsewhere.

Readme file requirements The package meets the readme requirements below:

[x] Package has a README.md file in the root directory.

The README should include, from top to bottom:

[x] The package name
[ ] Badges for:
- [ ] Continuous integration and test coverage,
- [x] Docs building (if you have a documentation website),
- [x] A repostatus.org badge,
- [ ] Python versions supported,
- [x] Current package version (on PyPI / Conda).

[x] Short description of package goals.
[x] Package installation instructions
[x] Any additional setup required to use the package (authentication tokens, etc.)
[x] Descriptive links to all vignettes. If the package is small, there may only be a need for one vignette which could be placed in the README.md file.
- [x] Brief demonstration of package usage (as it makes sense - links to vignettes could also suffice here if package description is clear)
[x] Link to your documentation website.
[x] If applicable, how the package compares to other similar packages and/or how it relates to other packages in the scientific ecosystem.
[x] Citation information

Usability

[ ] Package documentation is clear and easy to find and use.
[x] The need for the package is clear
[x] All functions have documentation and associated examples for use
[x] The package is easy to install

Functionality

[ ] Installation: Installation succeeds as documented.
[x] Functionality: Any functional claims of the software have been confirmed.
[x] Performance: Any performance claims of the software have been confirmed.
[x] Automated tests:
- [x] All tests pass on the reviewer's local machine for the package version submitted by the author. Ideally this should be a tagged version making it easy for reviewers to install.
- [x] Tests cover essential functions of the package and a reasonable range of inputs and conditions.
[x] Continuous Integration: Has continuous integration setup (We suggest using Github actions but any CI platform is acceptable for review)
[x] Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines. A few notable highlights to look at:
- [x] Package supports modern versions of Python and not End of life versions.
- [x] Code format is standard throughout package and follows PEP 8 guidelines (CI tests for linting pass)

For packages also submitting to JOSS

[ ] The package has an obvious research application according to JOSS's definition in their submission requirements.

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

[x] A short summary describing the high-level functionality of the software
[x] Authors: A list of authors with their affiliations
[x] A statement of need clearly stating problems the software is designed to solve and its target audience.
[x] References: With DOIs for all those that have one (e.g. papers, datasets, software).

Final approval (post-review)

[x] The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 1.5h

Review Comments

pyxplor.py is empty.
Although there is a clickable button that leads you to the documentation, I would suggest adding a link in the "About" section of the repo so that less experienced users don't struggle as much when trying to find it.
After following the installation instructions, I couldn't run the tests (Heads-up, I am using a Windows computer). I got the following output in the console:

(pyxplor) Marcony1@MSI ~/OneDrive - Fundacion Universidad de las Americas Puebla/Documents/MDS/Block 4/DSCI 524/PyXplor (main)
$ ls
CHANGELOG.md  CONDUCT.md  CONTRIBUTING.md  docs/  img/  LICENSE  poetry.lock  pyproject.toml  README.md  src/  tests/

(pyxplor) Marcony1@MSI ~/OneDrive - Fundacion Universidad de las Americas Puebla/Documents/MDS/Block 4/DSCI 524/PyXplor (main)
$ pytest tests/
bash: pytest: command not found

I had to pip install pytest and then I got this:

pip install pytest

=============================================== short test summary info ===============================================
ERROR tests/test_plot_binary.py
ERROR tests/test_plot_categorical.py
ERROR tests/test_plot_numeric.py
ERROR tests/test_plot_time_series.py
ERROR tests/test_pyxplor.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 5 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
================================================== 5 errors in 0.41s ==================================================

(pyxplor) Marcony1@MSI ~/OneDrive - Fundacion Universidad de las Americas Puebla/Documents/MDS/Block 4/DSCI 524/PyXplor (main)
$ pytest tests/ --cov
ERROR: usage: pytest [options] [file_or_dir] [file_or_dir] [...]
pytest: error: unrecognized arguments: --cov
  inifile: None
  rootdir: C:\Users\Marcony1\OneDrive - Fundacion Universidad de las Americas Puebla\Documents\MDS\Block 4\DSCI 524\PyXplor

pip install pytest-cov

---------- coverage: platform win32, python 3.11.7-final-0 -----------
Name                             Stmts   Miss  Cover
----------------------------------------------------
tests\test_plot_binary.py           94     92     2%
tests\test_plot_categorical.py      70     69     1%
tests\test_plot_numeric.py          80     79     1%
tests\test_plot_time_series.py      66     64     3%
tests\test_pyxplor.py                1      0   100%
----------------------------------------------------
TOTAL                              311    304     2%

=============================================== short test summary info ===============================================
ERROR tests/test_plot_binary.py
ERROR tests/test_plot_categorical.py
ERROR tests/test_plot_numeric.py
ERROR tests/test_plot_time_series.py
ERROR tests/test_pyxplor.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 5 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
================================================== 5 errors in 0.37s ==================================================

That way, I could run the tests, but I got some errors.

I tried to run the code in example.ipynb and I got this:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-1-xxxxxxxxxxxx> in <module>
----> 1 import pyxplor
      2 from pyxplor.plot_binary import plot_binary
      3 from pyxplor.plot_categorical import plot_categorical

ModuleNotFoundError: No module named 'pyxplor'
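
A traceback like the one above usually means the package is not installed in the interpreter that runs the notebook. A small, hedged check that can be run in the same kernel before debugging further is sketched below. `is_installed` is a hypothetical helper written for this illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the module can be located by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


if not is_installed("pyxplor"):
    print("pyxplor is not importable from this environment; "
          "install it into the active environment before running the notebook or the tests.")
```

The same check explains collection errors in pytest, since the test modules import the package as well.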

So, apparently, pyxplor was still not available on my computer after following the instructions. I ended up using pip install pyxplor. After that, I could run the code without any issues. Also, I re-ran the tests and now all of them passed:

================================================== warnings summary ===================================================
tests\test_plot_binary.py:2
  C:\Users\Marcony1\OneDrive - Fundacion Universidad de las Americas Puebla\Documents\MDS\Block 4\DSCI 524\PyXplor\tests\test_plot_binary.py:2: DeprecationWarning:
  Pyarrow will become a required dependency of pandas in the next major release of pandas (pandas 3.0),
  (to allow more performant data types, such as the Arrow string type, and better interoperability with other libraries)
  but was not found to be installed on your system.
  If this would cause problems for you,
  please provide us feedback at https://github.com/pandas-dev/pandas/issues/54466

    import pandas as pd

tests/test_plot_time_series.py::test_valid_input
  C:\Users\Marcony1\miniconda3\envs\pyxplor\Lib\site-packages\pyxplor\plot_time_series.py:104: FutureWarning: 'M' is deprecated and will be removed in a future version, please use 'ME' instead.
    ts_data = input_df.set_index(date_column).resample(freq)[column].mean()

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================================== 61 passed, 2 warnings in 21.40s ===========================================
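
The `FutureWarning` in the output above concerns the month-end resampling alias. A minimal sketch of one way to choose the alias by pandas version follows, on the assumption that `'ME'` became available in pandas 2.2, the release that deprecated `'M'`. `month_end_alias` is a hypothetical helper, not part of PyXplor or pandas.

```python
def month_end_alias(pandas_version):
    """Return the month-end resampling alias suited to the given pandas version string."""
    major, minor = (int(part) for part in pandas_version.split(".")[:2])
    # Assumption: 'ME' was introduced alongside the deprecation of 'M' in pandas 2.2.
    return "ME" if (major, minor) >= (2, 2) else "M"


# Possible use inside a plotting function (pd being the imported pandas module):
#     freq = month_end_alias(pd.__version__)
```

Whether the package should instead require a newer pandas and use `'ME'` unconditionally is a design decision for the authors.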

As of today, the installation instructions' title is "Installation (developers)". As a regular user, that made me think that I was looking into the wrong section and I searched for the "Installation (mortals)" section. As I didn't find one, I assumed that you chose this title as the package is still in development. However, I would remove the "(developers)" in the final version or create a regular user section and move the developer's instructions to the ReadTheDocs full documentation website.

In general, I really liked your package and I do find it pretty handy. As a more advanced user, I could definitely make things work. However, if you want novice programmers to use your package, it would help to make the instructions more beginner-friendly and to make sure that the package works on different operating systems, so that they won't have any problems trying to use it.

Good job overall!

UBC-MDS / software-review-2024

PyXplor - Group 2 #9
