openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal
709 stars 38 forks source link

[REVIEW]: PAM: Population Activity Modeller #6097

Closed editorialbot closed 5 months ago

editorialbot commented 10 months ago

Submitting author: !--author-handle-->@fredshone<!--end-author-handle-- (Fred Shone) Repository: https://github.com/arup-group/pam Branch with paper.md (empty if default branch): joss Version: v0.3.2 Editor: !--editor-->@martinfleis<!--end-editor-- Reviewers: @jamesdamillington, @martibosch Archive: 10.5281/zenodo.10948231

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/c8737f0649d60fffcd3ba0aa25fa5f40"><img src="https://joss.theoj.org/papers/c8737f0649d60fffcd3ba0aa25fa5f40/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/c8737f0649d60fffcd3ba0aa25fa5f40/status.svg)](https://joss.theoj.org/papers/c8737f0649d60fffcd3ba0aa25fa5f40)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@jamesdamillington & @martibosch, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @martinfleis know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest

Checklists

📝 Checklist for @jamesdamillington

📝 Checklist for @martibosch

editorialbot commented 10 months ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf
editorialbot commented 10 months ago
Software report:

github.com/AlDanial/cloc v 1.88  T=0.29 s (642.4 files/s, 198646.1 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                         103           4789           2487          18491
JSON                             5              0              0           4770
XML                             17            182             39           4143
Jupyter Notebook                18              0          17653           2387
Markdown                        16            313              0            757
YAML                            14             37              8            478
SQL                              3              0              0            164
XSD                              1              8              2            125
DTD                              4             89            254            102
TOML                             1             24             14             82
TeX                              1              7              0             77
HTML                             1              3              1              9
CSS                              1              1              2              7
Dockerfile                       1              2              0              7
-------------------------------------------------------------------------------
SUM:                           186           5455          20460          31599
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository
editorialbot commented 10 months ago

Wordcount for paper.md is 714

editorialbot commented 10 months ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.5334/baw is OK
- 10.17226/22357 is OK
- 10.1080/12265934.2013.835118 is OK
- 10.14279/depositonce-9835 is OK
- 10.1016/j.procs.2021.03.089 is OK

MISSING DOIs

- None

INVALID DOIs

- None
editorialbot commented 10 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

martinfleis commented 10 months ago

👋🏼 @fredshone, @jamesdamillington, @martibosch this is the review thread for the paper. All of our communications will happen here from now on.

All reviewers should create checklists with the JOSS requirements using the command @editorialbot generate my checklist. As you go over the submission, please check any items that you feel have been satisfied. There are also links to the JOSS reviewer guidelines.

The JOSS review is different from most other journals. Our goal is to work with the authors to help them meet our criteria instead of merely passing judgment on the submission. As such, the reviewers are encouraged to submit issues (and small pull requests if needed) on the software repository. When doing so, please mention https://github.com/openjournals/joss-reviews/issues/6097 so that a link is created to this thread (and I can keep an eye on what is happening). Please also feel free to comment and ask questions on this thread. In my experience, it is better to post comments/questions/suggestions as you come across them instead of waiting until you've reviewed the entire package.

As agreed, I expected that the review will take a bit longer than usual, aiming at January. Please let me know if any of you require significantly more time at any point. We can also use editorialbot to set automatic reminders if you know you'll be away for a known period of time.

Please feel free to ping me (@martinfleis) if you have any questions/concerns.

Thanks!

martinfleis commented 8 months ago

Hi @jamesdamillington, @martibosch, just a minor reminder that this review is ongoing. We agreed to target end of January so there's still a bit of time but wanted to ensure it won't slip of your radar. Thanks!

jamesdamillington commented 8 months ago

Review checklist for @jamesdamillington

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

jamesdamillington commented 8 months ago

I have now reviewed this. It looks very good and as shown by the checklist I am happy it meets all the criteria and could be accepted. The tool works as expected and from the examples I have explored I believe the functional claims are supported.

One minor point (regarding the paper, not the code or implementation, so I will make it here) is that Figure 1 is not referenced in the main text nor is it explained anywhere I can see (neither the paper nor in the documentation). The figure is an illustration to the text, to demonstrate some point made in the text etc., so should be referenced. Furthermore, while the figure may be self-explanatory to the authors but it may not be to the reader. For example, I am not clear what the diagonal shading between work and shop for Persons A and B means. I assume A and B work at the same location given the brown/pink shading but I don't understand the diagonal connections. I think some general description of Figure 1 plus maybe explanation of this point would be useful (in the paper at least, but then maybe also copy the text where the Figure appears in documentation).

fredshone commented 8 months ago

@editorialbot generate pdf

editorialbot commented 8 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

jamesdamillington commented 8 months ago

Thanks for editing as suggested @fredshone Looks good.

fredshone commented 8 months ago

@jamesdamillington many thanks 🙏🏻. I added a reference for the offending figure in the text and some explanation to the caption. It is a rather abstract representation of a complex data structure - please let me know if you'd like the figure reworked or removed.

martinfleis commented 7 months ago

Hey @martibosch, could you give us an estimation of when you'd be able to start? I know you mentioned you are a bit swamped but wanted to ensure we'll be able to do the review in a timely manner. Thanks!

@jamesdamillington Thanks a lot!

martinfleis commented 7 months ago

@editorialbot remind @martibosch in one week

editorialbot commented 7 months ago

Reminder set for @martibosch in one week

editorialbot commented 7 months ago

:wave: @martibosch, please update us on how your review is going (this is an automated reminder).

martinfleis commented 6 months ago

Just FYI, I had a short email exchange with @martibosch who needs a bit more time for the review.

martibosch commented 6 months ago

Review checklist for @martibosch

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

martibosch commented 6 months ago

Hello,

first of all sorry for the delay in my review. Here we go:

This is a super solid package which provides a super friendly/Pythonic interface for population activity modeling. A few years ago I was interested in activity-based population synthesis to model the impacts of the spatial organization of urban areas on travel behaviour, but I was overwhelmed due to the lack of existing Python tools. I think this library could have changed that.

I like that there are so many notebooks and that they provide a great overview of the functionalities - again, I find the library code very intuitive and the notebooks are very easy to follow. Regarding the library itself, I would only like to raise three points:

  1. The fact that it is so Pythonic and object oriented makes me wonder about performance (which you point out in the manuscript). All the notebooks show small populations, which are great for pedagogical purposes but how long would it take to perform common tasks for a population with, e.g., 1M households? For instance, in 07_travel_survey_to_matsim.ipynb, what is the size of the data, e.g., number of households, people, trips...? how long does it take to run the notebook? Overall, it would be interesting to see more real-world examples.
  2. The data structures of 06_travel_plots.ipynb remind me of scikit-mobility, I wonder if it would be interesting to mention the links or think of a more seamless integration for this part (please do not hesitate to point out if I am mistaken and there is no point in doing so).
  3. (tiny detail, feel free to ignore) Cloning the repo took some time: it uses 25 MB (even though 217 MB were downloaded, I do not know why this happens but I suppose is more related to git/github and thus out of the scope of this review) - I wonder if it would be better to put the examples in a separate repo, but this is completely up to the maintainers.

Finally, I have some manuscript corrections:

  1. The name for the last author is missing (after the "and").
  2. I believe that "Python" should always be capitalized.
  3. lines 23-25 and 28-29 are repeated. I believe a different (synonym) sentence should be used in the latter.
  4. The biggest issue I see is that the summary does not read as such: a summary should provide an overview of the problem as well as a brief overview of how the proposed tool addresses it. The latter is clearly missing.

Martí

fredshone commented 6 months ago

@editorialbot generate pdf

editorialbot commented 6 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

fredshone commented 6 months ago

Thank you for the thorough review @martibosch.

The revisions in the above paper should address your manuscript corrections:

Regarding the project:

  1. Performance

    • Certainly restricted by the OO design, but we trade for use-ability and broad applicability.
    • A millions agents is about 10G. Read time from tabular or xml is about 5 minutes.
    • There has been profiling and reworking of very commonly used features (such as io) to speed them up.
    • Functionality in the CLI uses streaming to minimize memory requirements. It is possible to built some parallelization on this streaming approach for appropriate tasks.
    • The examples are deliberately minimal as we include them (and their data) in our repo and CI pipeline. This ensures we find out when they break. I don't think it is feasible for small open source projects to long-term maintain examples that use external data. Happy to talk anyone's ear off about this 😃 .
    • We do periodically muse on performance and our next option would be to be stricter about data types and write some of the core objects in C/Rust. But this is some long way off.
  2. Scikit-mobility:

  1. Repo bloat:

Many thanks!

brynpickering commented 6 months ago

To add on project bloat: this doesn't extend to the downloadable package (< 1MB) indexed on PyPI and Anaconda.

martibosch commented 5 months ago

Hello @fredshone @brynpickering, thank you for your responses. I believe the paper can now move forward.

Just a final note regarding the real-world examples: I understand that is not practical to use large external datasets in CI. Nonetheless, I would recommend mentioning some benchmarks (e.g., the 1M agents) somewhere in the examples, documentation or readme, so that users have some reference expectations. Otherwise, users may open a relatively large dataset and wonder whether it is taking the expected time or the process is stuck.

Martí

fredshone commented 5 months ago

makes sense. I've added a note in the tabular read example. PR here. It essentially reads; "this is a small example, real datasets will take longer".

Thanks Martí

martinfleis commented 5 months ago

@editorialbot check references

martinfleis commented 5 months ago

@editorialbot generate pdf

martinfleis commented 5 months ago

@editorialbot generate post-review checklist

editorialbot commented 5 months ago

I'm sorry human, I don't understand that. You can see what commands I support by typing:

@editorialbot commands

martinfleis commented 5 months ago

Post-Review Checklist for Editor and Authors

Additional Author Tasks After Review is Complete

Editor Tasks Prior to Acceptance

editorialbot commented 5 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

editorialbot commented 5 months ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.5334/baw is OK
- 10.17226/22357 is OK
- 10.1080/12265934.2013.835118 is OK
- 10.14279/depositonce-9835 is OK
- 10.1016/j.procs.2021.03.089 is OK

MISSING DOIs

- No DOI given, and none found for title: Pandemic Activity Modifier: Intro
- No DOI given, and none found for title: European Transport Conference Papers 2022: Agent-b...
- No DOI given, and none found for title: ActivitySim: Large-Scale Agent-Based Activity Gene...

INVALID DOIs

- None
martinfleis commented 5 months ago

Thank you @jamesdamillington and @martibosch!

@fredshone please refer to the post-review tasks listed in https://github.com/openjournals/joss-reviews/issues/6097#issuecomment-2036442926 and let me know once you're done with them. Let me know if you have any questions.

fredshone commented 5 months ago

@martinfleis - slight delay for a fresh release - I have uploaded to Zenodo:

VERSION: v0.3.2 DOI: 10.5281/zenodo.10948231 URL: https://zenodo.org/doi/10.5281/zenodo.10948231

Title and authors matching paper. 🤞🏻

martinfleis commented 5 months ago

@fredshone Can you ensure that the list of authors of the Zenodo archive matches the list in the paper? As in proper authors alongside you not using custom fields Zenodo offers.

martinfleis commented 5 months ago

@editorialbot set 10.5281/zenodo.10948231 as archive

editorialbot commented 5 months ago

Done! archive is now 10.5281/zenodo.10948231

martinfleis commented 5 months ago

@editorialbot set v0.3.2 as version

editorialbot commented 5 months ago

Done! version is now v0.3.2

fredshone commented 5 months ago

@martinfleis

Can you ensure that the list of authors of the Zenodo archive matches the list in the paper? As in proper authors alongside you not using custom fields Zenodo offers.

All paper authors now alongside myself (Zenodo calls these "creators").

Hope it's ok now.

martinfleis commented 5 months ago

@fredshone Any chance to ensure the order matches as well?

fredshone commented 5 months ago

done

martinfleis commented 5 months ago

@editorialbot recommend-accept

editorialbot commented 5 months ago
Attempting dry run of processing paper acceptance...
editorialbot commented 5 months ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.5334/baw is OK
- 10.17226/22357 is OK
- 10.1080/12265934.2013.835118 is OK
- 10.14279/depositonce-9835 is OK
- 10.1016/j.procs.2021.03.089 is OK

MISSING DOIs

- No DOI given, and none found for title: Pandemic Activity Modifier: Intro
- No DOI given, and none found for title: European Transport Conference Papers 2022: Agent-b...
- No DOI given, and none found for title: ActivitySim: Large-Scale Agent-Based Activity Gene...

INVALID DOIs

- None
martinfleis commented 5 months ago

Thanks @fredshone! The paper is now in the hands of the editor in chief for the final checks and processing.

editorialbot commented 5 months ago

:wave: @openjournals/sbcs-eics, this paper is ready to be accepted and published.

Check final proof :point_right::page_facing_up: Download article

If the paper PDF and the deposit XML files look good in https://github.com/openjournals/joss-papers/pull/5239, then you can now move forward with accepting the submission by compiling again with the command @editorialbot accept

fredshone commented 5 months ago

@editorialbot accept

editorialbot commented 5 months ago

I'm sorry @fredshone, I'm afraid I can't do that. That's something only eics are allowed to do.