editorialbot commented 10 months ago

Submitting author: !--author-handle-->@fredshone@martinfleis<!--end-editor-- Reviewers: @jamesdamillington, @martibosch Archive: 10.5281/zenodo.10948231

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/c8737f0649d60fffcd3ba0aa25fa5f40"><img src="https://joss.theoj.org/papers/c8737f0649d60fffcd3ba0aa25fa5f40/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/c8737f0649d60fffcd3ba0aa25fa5f40/status.svg)](https://joss.theoj.org/papers/c8737f0649d60fffcd3ba0aa25fa5f40)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@jamesdamillington & @martibosch, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @martinfleis know.

✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨

Checklists

📝 Checklist for @jamesdamillington

📝 Checklist for @martibosch

editorialbot commented 10 months ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

editorialbot commented 10 months ago

Software report:

github.com/AlDanial/cloc v 1.88  T=0.29 s (642.4 files/s, 198646.1 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                         103           4789           2487          18491
JSON                             5              0              0           4770
XML                             17            182             39           4143
Jupyter Notebook                18              0          17653           2387
Markdown                        16            313              0            757
YAML                            14             37              8            478
SQL                              3              0              0            164
XSD                              1              8              2            125
DTD                              4             89            254            102
TOML                             1             24             14             82
TeX                              1              7              0             77
HTML                             1              3              1              9
CSS                              1              1              2              7
Dockerfile                       1              2              0              7
-------------------------------------------------------------------------------
SUM:                           186           5455          20460          31599
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository

editorialbot commented 10 months ago

Wordcount for paper.md is 714

editorialbot commented 10 months ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.5334/baw is OK
- 10.17226/22357 is OK
- 10.1080/12265934.2013.835118 is OK
- 10.14279/depositonce-9835 is OK
- 10.1016/j.procs.2021.03.089 is OK

MISSING DOIs

- None

INVALID DOIs

- None

editorialbot commented 10 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

martinfleis commented 10 months ago

👋🏼 @fredshone, @jamesdamillington, @martibosch this is the review thread for the paper. All of our communications will happen here from now on.

All reviewers should create checklists with the JOSS requirements using the command @editorialbot generate my checklist. As you go over the submission, please check any items that you feel have been satisfied. There are also links to the JOSS reviewer guidelines.

The JOSS review is different from most other journals. Our goal is to work with the authors to help them meet our criteria instead of merely passing judgment on the submission. As such, the reviewers are encouraged to submit issues (and small pull requests if needed) on the software repository. When doing so, please mention https://github.com/openjournals/joss-reviews/issues/6097 so that a link is created to this thread (and I can keep an eye on what is happening). Please also feel free to comment and ask questions on this thread. In my experience, it is better to post comments/questions/suggestions as you come across them instead of waiting until you've reviewed the entire package.

As agreed, I expected that the review will take a bit longer than usual, aiming at January. Please let me know if any of you require significantly more time at any point. We can also use editorialbot to set automatic reminders if you know you'll be away for a known period of time.

Please feel free to ping me (@martinfleis) if you have any questions/concerns.

Thanks!

martinfleis commented 8 months ago

Hi @jamesdamillington, @martibosch, just a minor reminder that this review is ongoing. We agreed to target end of January so there's still a bit of time but wanted to ensure it won't slip of your radar. Thanks!

jamesdamillington commented 8 months ago

Review checklist for @jamesdamillington

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/arup-group/pam?
[x] License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
[x] Contribution and authorship: Has the submitting author (@fredshone) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
[x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
[x] Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
[x] Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
[x] Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[x] A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
[x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

jamesdamillington commented 8 months ago

I have now reviewed this. It looks very good and as shown by the checklist I am happy it meets all the criteria and could be accepted. The tool works as expected and from the examples I have explored I believe the functional claims are supported.

One minor point (regarding the paper, not the code or implementation, so I will make it here) is that Figure 1 is not referenced in the main text nor is it explained anywhere I can see (neither the paper nor in the documentation). The figure is an illustration to the text, to demonstrate some point made in the text etc., so should be referenced. Furthermore, while the figure may be self-explanatory to the authors but it may not be to the reader. For example, I am not clear what the diagonal shading between work and shop for Persons A and B means. I assume A and B work at the same location given the brown/pink shading but I don't understand the diagonal connections. I think some general description of Figure 1 plus maybe explanation of this point would be useful (in the paper at least, but then maybe also copy the text where the Figure appears in documentation).

fredshone commented 8 months ago

@editorialbot generate pdf

editorialbot commented 8 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

jamesdamillington commented 8 months ago

Thanks for editing as suggested @fredshone Looks good.

fredshone commented 8 months ago

@jamesdamillington many thanks 🙏🏻. I added a reference for the offending figure in the text and some explanation to the caption. It is a rather abstract representation of a complex data structure - please let me know if you'd like the figure reworked or removed.

martinfleis commented 7 months ago

Hey @martibosch, could you give us an estimation of when you'd be able to start? I know you mentioned you are a bit swamped but wanted to ensure we'll be able to do the review in a timely manner. Thanks!

@jamesdamillington Thanks a lot!

martinfleis commented 7 months ago

@editorialbot remind @martibosch in one week

editorialbot commented 7 months ago

Reminder set for @martibosch in one week

editorialbot commented 7 months ago

:wave: @martibosch, please update us on how your review is going (this is an automated reminder).

martinfleis commented 6 months ago

Just FYI, I had a short email exchange with @martibosch who needs a bit more time for the review.

martibosch commented 6 months ago

Review checklist for @martibosch

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/arup-group/pam?
[x] License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
[x] Contribution and authorship: Has the submitting author (@fredshone) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
[x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
[x] Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
[x] Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
[x] Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[x] A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
[x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

martibosch commented 6 months ago

Hello,

first of all sorry for the delay in my review. Here we go:

This is a super solid package which provides a super friendly/Pythonic interface for population activity modeling. A few years ago I was interested in activity-based population synthesis to model the impacts of the spatial organization of urban areas on travel behaviour, but I was overwhelmed due to the lack of existing Python tools. I think this library could have changed that.

I like that there are so many notebooks and that they provide a great overview of the functionalities - again, I find the library code very intuitive and the notebooks are very easy to follow. Regarding the library itself, I would only like to raise three points:

The fact that it is so Pythonic and object oriented makes me wonder about performance (which you point out in the manuscript). All the notebooks show small populations, which are great for pedagogical purposes but how long would it take to perform common tasks for a population with, e.g., 1M households? For instance, in 07_travel_survey_to_matsim.ipynb, what is the size of the data, e.g., number of households, people, trips...? how long does it take to run the notebook? Overall, it would be interesting to see more real-world examples.
The data structures of 06_travel_plots.ipynb remind me of scikit-mobility, I wonder if it would be interesting to mention the links or think of a more seamless integration for this part (please do not hesitate to point out if I am mistaken and there is no point in doing so).
(tiny detail, feel free to ignore) Cloning the repo took some time: it uses 25 MB (even though 217 MB were downloaded, I do not know why this happens but I suppose is more related to git/github and thus out of the scope of this review) - I wonder if it would be better to put the examples in a separate repo, but this is completely up to the maintainers.

Finally, I have some manuscript corrections:

The name for the last author is missing (after the "and").
I believe that "Python" should always be capitalized.
lines 23-25 and 28-29 are repeated. I believe a different (synonym) sentence should be used in the latter.
The biggest issue I see is that the summary does not read as such: a summary should provide an overview of the problem as well as a brief overview of how the proposed tool addresses it. The latter is clearly missing.

Martí

fredshone commented 6 months ago

@editorialbot generate pdf

editorialbot commented 6 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

fredshone commented 6 months ago

Thank you for the thorough review @martibosch.

The revisions in the above paper should address your manuscript corrections:

fixed missing author
capitalized Python
removed repetition
added a beginning and end paragraph to summary

Regarding the project:

Performance
- Certainly restricted by the OO design, but we trade for use-ability and broad applicability.
- A millions agents is about 10G. Read time from tabular or xml is about 5 minutes.
- There has been profiling and reworking of very commonly used features (such as io) to speed them up.
- Functionality in the CLI uses streaming to minimize memory requirements. It is possible to built some parallelization on this streaming approach for appropriate tasks.
- The examples are deliberately minimal as we include them (and their data) in our repo and CI pipeline. This ensures we find out when they break. I don't think it is feasible for small open source projects to long-term maintain examples that use external data. Happy to talk anyone's ear off about this 😃 .
- We do periodically muse on performance and our next option would be to be stricter about data types and write some of the core objects in C/Rust. But this is some long way off.
Scikit-mobility:

Thanks for the link. I agree there is some duplicate functionality.
I have raised this issue and will discuss with team next week.

Repo bloat:

Thanks for bringing this up. In-fact the project is 250M which is unnecessary.
Normally a download size (217M) would be compressed and significantly smaller than uncompressed rep size.
In fact we don't get much compression because we have a bloated git history.
I have detailed in this issue and will push for a clean up next week.
Pretty sure the bulk of the bloat is some big data files lurking in our history.
We actually have checks for this in our pipeline and pre-commit so don't know how these got through.

Many thanks!

brynpickering commented 6 months ago

To add on project bloat: this doesn't extend to the downloadable package (< 1MB) indexed on PyPI and Anaconda.

martibosch commented 5 months ago

Hello @fredshone @brynpickering, thank you for your responses. I believe the paper can now move forward.

Just a final note regarding the real-world examples: I understand that is not practical to use large external datasets in CI. Nonetheless, I would recommend mentioning some benchmarks (e.g., the 1M agents) somewhere in the examples, documentation or readme, so that users have some reference expectations. Otherwise, users may open a relatively large dataset and wonder whether it is taking the expected time or the process is stuck.

Martí

fredshone commented 5 months ago

makes sense. I've added a note in the tabular read example. PR here. It essentially reads; "this is a small example, real datasets will take longer".

Thanks Martí

martinfleis commented 5 months ago

@editorialbot check references

martinfleis commented 5 months ago

@editorialbot generate pdf

martinfleis commented 5 months ago

@editorialbot generate post-review checklist

editorialbot commented 5 months ago

I'm sorry human, I don't understand that. You can see what commands I support by typing:

@editorialbot commands

martinfleis commented 5 months ago

Post-Review Checklist for Editor and Authors

Additional Author Tasks After Review is Complete

Double check authors and affiliations (including ORCIDs)
Make a release of the software with the latest changes from the review and post the version number here. This is the version that will be used in the JOSS paper.
Archive the release on Zenodo/figshare/etc and post the DOI here.
Make sure that the title and author list (including ORCIDs) in the archive match those in the JOSS paper.
Make sure that the license listed for the archive is the same as the software license.

Editor Tasks Prior to Acceptance

[x] Read the text of the paper and offer comments/corrections (as either a list or a pull request)
[x] Check that the archive title, author list, version tag, and the license are correct
[x] Set archive DOI with @editorialbot set <DOI here> as archive
[x] Set version with @editorialbot set <version here> as version
[x] Double check rendering of paper with @editorialbot generate pdf
[x] Specifically check the references with @editorialbot check references and ask author(s) to update as needed
[x] Recommend acceptance with @editorialbot recommend-accept

editorialbot commented 5 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

editorialbot commented 5 months ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.5334/baw is OK
- 10.17226/22357 is OK
- 10.1080/12265934.2013.835118 is OK
- 10.14279/depositonce-9835 is OK
- 10.1016/j.procs.2021.03.089 is OK

MISSING DOIs

- No DOI given, and none found for title: Pandemic Activity Modifier: Intro
- No DOI given, and none found for title: European Transport Conference Papers 2022: Agent-b...
- No DOI given, and none found for title: ActivitySim: Large-Scale Agent-Based Activity Gene...

INVALID DOIs

- None

martinfleis commented 5 months ago

Thank you @jamesdamillington and @martibosch!

@fredshone please refer to the post-review tasks listed in https://github.com/openjournals/joss-reviews/issues/6097#issuecomment-2036442926 and let me know once you're done with them. Let me know if you have any questions.

fredshone commented 5 months ago

@martinfleis - slight delay for a fresh release - I have uploaded to Zenodo:

VERSION: v0.3.2 DOI: 10.5281/zenodo.10948231 URL: https://zenodo.org/doi/10.5281/zenodo.10948231

Title and authors matching paper. 🤞🏻

martinfleis commented 5 months ago

@fredshone Can you ensure that the list of authors of the Zenodo archive matches the list in the paper? As in proper authors alongside you not using custom fields Zenodo offers.

martinfleis commented 5 months ago

@editorialbot set 10.5281/zenodo.10948231 as archive

editorialbot commented 5 months ago

Done! archive is now 10.5281/zenodo.10948231

martinfleis commented 5 months ago

@editorialbot set v0.3.2 as version

editorialbot commented 5 months ago

Done! version is now v0.3.2

fredshone commented 5 months ago

@martinfleis

Can you ensure that the list of authors of the Zenodo archive matches the list in the paper? As in proper authors alongside you not using custom fields Zenodo offers.

All paper authors now alongside myself (Zenodo calls these "creators").

Hope it's ok now.

martinfleis commented 5 months ago

@fredshone Any chance to ensure the order matches as well?

fredshone commented 5 months ago

done

martinfleis commented 5 months ago

@editorialbot recommend-accept

editorialbot commented 5 months ago

Attempting dry run of processing paper acceptance...

editorialbot commented 5 months ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.5334/baw is OK
- 10.17226/22357 is OK
- 10.1080/12265934.2013.835118 is OK
- 10.14279/depositonce-9835 is OK
- 10.1016/j.procs.2021.03.089 is OK

MISSING DOIs

- No DOI given, and none found for title: Pandemic Activity Modifier: Intro
- No DOI given, and none found for title: European Transport Conference Papers 2022: Agent-b...
- No DOI given, and none found for title: ActivitySim: Large-Scale Agent-Based Activity Gene...

INVALID DOIs

- None

martinfleis commented 5 months ago

Thanks @fredshone! The paper is now in the hands of the editor in chief for the final checks and processing.

editorialbot commented 5 months ago

:wave: @openjournals/sbcs-eics, this paper is ready to be accepted and published.

Check final proof :point_right::page_facing_up: Download article

If the paper PDF and the deposit XML files look good in https://github.com/openjournals/joss-papers/pull/5239, then you can now move forward with accepting the submission by compiling again with the command @editorialbot accept

fredshone commented 5 months ago

@editorialbot accept

editorialbot commented 5 months ago

I'm sorry @fredshone, I'm afraid I can't do that. That's something only eics are allowed to do.

openjournals / joss-reviews

[REVIEW]: PAM: Population Activity Modeller #6097

Status

Reviewer instructions & questions

Checklists

Review checklist for @jamesdamillington

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Review checklist for @martibosch

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Post-Review Checklist for Editor and Authors

Additional Author Tasks After Review is Complete

Editor Tasks Prior to Acceptance