pyOpenSci / software-submission

Submit your package for review by pyOpenSci here! If you have questions please post them here: https://pyopensci.discourse.group/
94 stars 36 forks source link

MovingPandas: Software Submission for Review #18

Closed anitagraser closed 4 years ago

anitagraser commented 4 years ago

Submitting Author: Anita Graser (@anitagraser)
All current maintainers: Anita Graser (@anitagraser)
Package Name: MovingPandas One-Line Description of Package: Trajectory classes and functions built on top of GeoPandas Repository Link: https://github.com/movingpandas/movingpandas Version submitted: 0.2 Editor: Jenny Palomino (@jlpalomino)
Reviewer 1: Ivan Ogasawara (@xmnlab)
Reviewer 2: Martin Fleischmann (@martinfleis) Archive: DOI JOSS DOI: N/A Version accepted: v 0.3.rc1 Date accepted (month/day/year): 03/19/2020


Description

MovingPandas is a package for dealing with movement data. MovingPandas implements a Trajectory class and corresponding methods based on GeoPandas. A trajectory has a time-ordered series of point geometries. These points and associated attributes are stored in a GeoDataFrame. MovingPandas implements spatial and temporal data access and analysis functions (covered in the open access publication [0]) as well as plotting functions. A usage example is available at http://exploration.movingpandas.org,

[0] Graser, A. (2019). MovingPandas: Efficient Structures for Movement Data in Python. GI_Forum ‒ Journal of Geographic Information Science 2019, 1-2019, 54-68. doi:10.1553/giscience2019_01_s54. URL: https://www.austriaca.at/rootcollection?arp=0x003aba2b

Scope

* Please fill out a pre-submission inquiry before submitting a data visualization package. For more info, see this section of our guidebook.

Geospatial (primary): The MovingPandas Trajectory class implements is a spatio-temporal data model for movement data.

Data visualization (secondary): The implemented plot functions enable straight-forward movement data exploration that goes beyond plotting the individual point locations by ensuring that trajectories are represented by linear segments between consecutive points.

Movement data / trajectories appear in many different scientific domains, including physics, biology, ecology, chemistry, transport and logistics, astrophysics, remote sensing, and more. For example, the provided tutorials cover the analysis of migrating birds as well as the analysis of ship movement within a port.

scikit-mobility is a similar package which is also in an early development stage and also deals with movement data. They implement TrajectoryDataFrames and FlowDataFrames on top of Pandas instead of GeoPandas. There is little overlap in the covered use cases and implemented functionality (comparing MovingPandas tutorials and scikit-mobility tutorials). MovingPandas focuses on spatio-temporal data exploration with corresponding functions for data manipulation and analysis. scikit-mobility on the other hand focuses on computing human mobility metrics, generating synthetic trajectories and assessing privacy risks.

https://github.com/pyOpenSci/software-review/issues/14

Technical checks

For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:

Publication options

JOSS Checks - [x] The package has an **obvious research application** according to JOSS's definition in their [submission requirements](https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements). Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS. - [x] The package is not a "minor utility" as defined by JOSS's [submission requirements](https://joss.readthedocs.io/en/latest/submitting.html#submission-requirements): "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria. - [x] The package contains a `paper.md` matching [JOSS's requirements](https://joss.readthedocs.io/en/latest/submitting.html#what-should-my-paper-contain) with a high-level description in the package root or in `inst/`. - [ ] The package is deposited in a long-term repository with the DOI: *Note: Do not submit your package separately to JOSS*

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.

Code of conduct

P.S. Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

Editor and review templates can be found here

lwasser commented 4 years ago

hi @anitagraser !! thank you again for this submission. it will be on our discussion list for this thursday's pyopensci meeting! can you think of any folks who might be well suited to review this package? we will need 2 people.

anitagraser commented 4 years ago

Thank you @lwasser! I think GeoPandas developers would be a good fit.

lwasser commented 4 years ago

@jlpalomino will be the fearless editor for this submission !! And @xmnlab will be our first reviewer. We will reach out to the geopandas folks. @martinfleis would you be interested in being a second reviewer for moving pandas? please let us know!

martinfleis commented 4 years ago

@lwasser I would love to do that, but not sure how fast I'd be. What is the timeframe?

lwasser commented 4 years ago

hey @martinfleis we understand. we typically ask for a 3 week turn around on reviews. Would that timeframe work for you or is that too quick? Many thanks for responding so quickly!

martinfleis commented 4 years ago

@lwasser that seems to be doable. Count me in.

lwasser commented 4 years ago

awesome!! Thanks @martinfleis for doing this!!

jlpalomino commented 4 years ago

Editor checks:


Editor comments

Thanks @xmnlab and @martinfleis for agreeing to review MovingPandas. Please use the following resources to submit your review:

The submitting author is open to receiving issues and PRs if you want to create a review using that approach (e.g. include links to the issue and/or PR in your review).

Feel free to reach out with any questions about the review process.


Reviewers: Ivan Ogasawara (@xmnlab) and Martin Fleischmann (@martinfleis) Due date: February 7th, 2020

martinfleis commented 4 years ago

Package Review

Documentation

The package includes all the following forms of documentation:

Readme requirements The package meets the readme requirements below:

The README should include, from top to bottom:

Functionality

Name                                               Stmts   Miss  Cover
----------------------------------------------------------------------
movingpandas/__init__.py                               6      0   100%
movingpandas/geometry_utils.py                        45      4    91%
movingpandas/overlay.py                              152     12    92%
movingpandas/tests/__init__.py                         0      0   100%
movingpandas/tests/test_geometry_utils.py             41      0   100%
movingpandas/tests/test_overlay.py                    78      0   100%
movingpandas/tests/test_trajectory.py                208      0   100%
movingpandas/tests/test_trajectory_collection.py      54      0   100%
movingpandas/trajectory.py                           298     29    90%
movingpandas/trajectory_aggregator.py                229    192    16%
movingpandas/trajectory_collection.py                113     42    63%
movingpandas/trajectory_plotter.py                    82     13    84%
----------------------------------------------------------------------
TOTAL                                               1306    292    78%

For packages co-submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

Final approval (post-review)

Estimated hours spent reviewing: 5


Review Comments

MovingPandas is a valuable addition to python geospatial stack. Being built on top of GeoPandas GeoDataFrames, its main classes are easy to understand, and the whole work with MovingPandas is very natural and straightforward. I had almost no issues in using it with my data, and everything works as advertised.

Initially, I was a bit confused by released versions of MovingPandas as when I started there was no release on GitHub and PyPI had different version than conda-forge. I would recommend following JOSS recommendation here and trying to keep these 3 (GitHub, PyPI, conda-forge) in sync as GitHub releases automatically send a notification to watching users.

During the review process, I have opened a couple of issues/PRs in the original repository, all linked above this post.

I am excited to see the further development of it as the latest addition of trajectory aggregator looks brilliant. I will certainly follow new releases, and once I have to work with movement data, MovingPandas will be the first choice.

anitagraser commented 4 years ago

Thanks a lot for the thorough review and great feedback, Martin! I'll work on the open issues.

I've been looking for the badge for pyOpenSci peer-review but haven't been able to locate one for ongoing peer review.

martinfleis commented 4 years ago

I've been looking for the badge for pyOpenSci peer-review but haven't been able to locate one for ongoing peer review.

That is more the question for @lwasser and @jlpalomino, I just copied review template.

jlpalomino commented 4 years ago

Thanks @martinfleis for your review.

@anitagraser I also was not able to find the badge details in our review guide, so I have made a note to look into where we provide this info.

Here is the badge for pyOpenSci peer review, with the second link being the URL to this issue: [![pyOpenSci](https://tinyurl.com/y22nb8up)](https://github.com/pyOpenSci/software-review/issues/18)

anitagraser commented 4 years ago

If fixed the remaining README issues: badges https://github.com/anitagraser/movingpandas/issues/53 and citation information https://github.com/anitagraser/movingpandas/commit/56ef608c3302f53f408f07781fbf9c30147eb1f3

The last open issue from Martin's review should be the Contribution guidelines.

xmnlab commented 4 years ago

I am planning to review MovingPandas today :)

anitagraser commented 4 years ago

Contribution guidelines are now available at https://github.com/anitagraser/movingpandas/blob/master/CONTRIBUTING.md

lwasser commented 4 years ago

@anitagraser when your package has fully passed both reviews and both reviewers are happy with your addressing their requested changes, we will ask you to add the badge to the readme!! please get in touch with any other questions. @martinfleis THANK YOU for this review!!

lwasser commented 4 years ago

one other question @anitagraser are you interested in JOSS? i see you didn't check the box. Joss only requires you to write a very short paper about the package (i can show you the earthpy example) . They accept the pyopensci technical review by default. no worries if you are not interested... but it's a nice citation to have if you are (linked to your orcid id and such).

anitagraser commented 4 years ago

@lwasser Thank you. Do I understand correctly that I should remove the pyopensci badge from the MovingPandas readme again? Is there a different badge to fulfill the requirement in the review template "the badge for pyOpenSci peer-review once it has started"?

Concerning JOSS, I have been thinking about it but wasn't sure if JOSS sees prior publications as an obstacle:

Graser, A. (2019). MovingPandas: Efficient Structures for Movement Data in Python. GI_Forum ‒ Journal of Geographic Information Science 2019, 1-2019, 54-68. doi:10.1553/giscience2019_01_s54. URL: https://www.austriaca.at/rootcollection?arp=0x003aba2b

anitagraser commented 4 years ago

@lwasser After looking at the earthpy paper you mentioned, I think there should be minimal overlap with the existing MovingPandas paper in GI_Forum. So yes, I'd like to try a JOSS submission.

Work in progress: https://github.com/anitagraser/movingpandas/tree/joss-paper

xmnlab commented 4 years ago

I will finish my review today :)

lwasser commented 4 years ago

@anitagraser that's great to hear!! ok so the process is pretty straight forward. Once the review is done and moving pandas is accepted here, I will ping our JOSS friends and they will help you through the JOSS process. this all looks great. the only challenge i had with JOSS was small formatting issues when it published to PDF. There is a nice tool however that is helpful. so when we get there ... I can help!

lwasser commented 4 years ago

Thank you @xmnlab :) we are so appreciative of your review!!

lwasser commented 4 years ago

Also you are the second package maintainer that has asked about other publications. i will email arfon tonight!! Also let me look into the badge specifics. it's not a huge deal but typically we've had people add the badge post review. I'll get back to you anita!!

lwasser commented 4 years ago

ok @anitagraser @arfon actually responded here: https://github.com/pyOpenSci/software-review/issues/16#issuecomment-581677530 on another review. he did say that if you have another citation already that you want to use to cite your software, then you probably don't need to worry about joss. i know you have already begun work on the joss paper... but see his comment and let us know what you think! thank you for your patience. we are still working through some of the details of our review process in these early stages

xmnlab commented 4 years ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Readme requirements The package meets the readme requirements below:

The README should include, from top to bottom:

Functionality

For packages co-submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

Final approval (post-review)

Estimated hours spent reviewing: 5h30


Review Comments

Review based on the version with the latest commit fb97166

Suggestions (optional):

PS/1: Examples for all user-facing functions is not required anymore. PS/2: new note: that it is NOT recommended to install MovingPandas from PyPI

xmnlab commented 4 years ago

@anitagraser it is a interesting project!

I have added the result of my review above. Let me know if you need any additional explanation or detail!

anitagraser commented 4 years ago

Thank you very much for the extensive feedback @xmnlab! It's very clear. I have just one question so far concerning "explain how to install movingpandas in development mode.": the README already has a section titled Development Installation with the necessary conda instructions. Are you missing something specific?

anitagraser commented 4 years ago

Status of improvements:

lwasser commented 4 years ago

Hi David!

For some reason I can’t seem to find this post but we are always looking for reviewers so please reach out to me if you have space to review another package! Moving pandas is under review currently and we do have 2 reviewers already. But we have a few other packages in the pipelines that will need reviewers.

anitagraser commented 4 years ago

@lwasser I've removed the review badge for now (after reading the last pyOpenSci meeting notes). Concerning JOSS: I think it would still be worthwhile to publish in JOSS since the current main paper is in a very specific venue with a very different audience.

jlpalomino commented 4 years ago

@anitagraser thanks for checking in about this and for your action! We are now working on a PR to the dev guide (https://github.com/pyOpenSci/dev_guide/pull/42) to address the timing of when to add the badge as well as to include the badge information for when the package is accepted.

xmnlab commented 4 years ago

@anitagraser about the "Development Installation" it explains how to prepare the environment but I didn't find the command line for install movingpandas in development mode, it could be, for example, python setup.py develop or pip install --no-deps -e . or if you want to just point to some nice documentation/tutorial that explains more about development mode you can add just a link to that page (eg.: https://python-packaging-tutorial.readthedocs.io/en/latest/setup_py.html#develop-mode) or also you can do both

anitagraser commented 4 years ago

@xmnlab Thank you! I wasn't aware of this develop mode. That's what caused the confusion, since my dev setup so far is finished when the environment is ready.

anitagraser commented 4 years ago

I've added usage examples to all user-facing functions except those that are self-explanatory (because they have either no parameters or have only simple parameters that are sufficiently described in the corresponding parameters section). I've also added direct links to the tutorials to https://movingpandas.readthedocs.io/en/latest/index.html

xmnlab commented 4 years ago

@anitagraser it looks very nice!

related to your point about the methods self-explanatory, I would add examples for each function/method including for these cases because:

1) if you want you can also integrate it with doctest (https://docs.python.org/3/library/doctest.html) in the future and it will help to test your docstring examples and 2) for some functions that return more complex data (like dataframe or series) it helps to understand better the data returned (also maybe it would be good to add more details about the data structure (eg: columns names and data types) for functions that return dataframes in the Returns section.

@lwasser do you have any thoughts about this point?

xmnlab commented 4 years ago

actually the editor for this submission is @jlpalomino. do you have any thoughts about this?

martinfleis commented 4 years ago

If I may add my thoughts here, I would say that there should be a distinction in the pyOpenSci review process between what is considered required change and what is a recommendation only. For example, I did not point out examples in docstrings as I did not feel it is a substantial issue. @xmnlab has a different approach, requiring complete examples ready for doctest.

This is a more general comment, so this issue might not be the right place for such discussion, but I believe that pyOpenSci needs to find a balance in the review process and maybe set some standards (maybe rOpenSci already deals with this?). The review should be valuable and thorough, but not onerous. I know it is hard to find the balance, especially if the submitted package would be of low quality (which is definitely not this case), but we are still in the open source world and we should acknowledge the nature of (largely free) work.

I believe that all the issues @xmnlab are valuable and will help @anitagraser. But it is a bit unclear to me if pyOpenSci review require this detail or if they should actually be recommendations, which are not in a way of publishing MovingPandas on pyOpenSci.

It may sounds as critics of @xmnlab's review, but it is not meant to be one :). I am genuinely curious about the expected level of the review as this is my first experience with pyOpenSci.

xmnlab commented 4 years ago

@martinfleis actually I am very curious about this too :) as the guideline says that all facing-user function should have examples (https://www.pyopensci.org/dev_guide/packaging/packaging_guide.html#Documentation) so using just tutorials for the examples is hard to check if all functions have an example inside. also from the user's perspective, If I want to see how to use a method, generally I will go to the API documentation not the tutorials.

examples inside the docstring will be published automatically inside the documentation (if the project generate the documentation from the code).

maybe I didn't express me well about the doctest ... I didn't want to suggest to use that .. I just want to comment that also it could help in the future .. but it is not required. the main point about the examples was the item 2.

But as required by https://www.pyopensci.org/dev_guide/packaging/packaging_guide.html#Documentation my first guess is that all public functions should have an example inside the docstrings or on the API documentation.

@jlpalomino could you guide us about this topic?

martinfleis commented 4 years ago

@xmnlab good point. I thank that after reading Good/Better/Best in this section I interpreted it as if it is covered in tutorial it is fine. But you are probably right.

anitagraser commented 4 years ago

Thank you all for working on clarifying this! I mostly looked at previously accepted projects to guide my work. For example, Pandera has usage examples but not for all user-facing functions: https://pandera.readthedocs.io/en/latest/API.html

jlpalomino commented 4 years ago

@anitagraser @martinfleis @xmnlab This is a great discussion, and I want to encourage you to continue it. I am interjecting here to say that I will discuss these comments further with @lwasser and get back to you with feedback from the editor perspective.

martinfleis commented 4 years ago

When I started my review I tried to understand what pyOpenSci expects. Based on the review guide, I felt that it tries to be rather inclusive than strict. After reading various Good/Better/Best explanations, my understanding was following:

These were (roughly) the rules I applied to my review. I would be happy to hear from @lwasser and @jlpalomino if they fit the aim of pyOpenSci. How strict do you want to be in the review?

jlpalomino commented 4 years ago

Hi @anitagraser @martinfleis @xmnlab thanks for your patience in this process.

@lwasser and I have decided that this is a good topic of conversation during the next pyOpenSci meeting, which will occur next Fri 2/28/20 at 11am MDT (details on discourse). We invite you all to attend if you are able.

jlpalomino commented 4 years ago

Hi @anitagraser @martinfleis @xmnlab - an update that I have posted on discourse ahead of the community meeting tomorrow at 11am MDT.

You have all brought up great points, and as the editor, I am looking forward to getting input from the community regarding "Good, "Better", "Best" and recommended vs required changes.

lwasser commented 4 years ago

hi @anitagraser @martinfleis @xmnlab i have added some language to the post based upon our meeting today. how much documentation is good enough? https://pyopensci.discourse.group/t/good-better-best-required-versus-recommended-changes-by-reviewers-your-feedback-is-requested/160/2 same link as the one jenny provided above but with a new post from our community call today. Do you guys have any thoughts on this? i really appreciate the feedback as we define what our standards are for the community!

anitagraser commented 4 years ago

Thank you for the update, @lwasser ! If I understand the intent of the discussion on discourse, having examples for all user-facing functions is not going to be a must. Besides trying to find the best wording for the guidelines, how should we proceed on this review? If I haven't missed anything, my work on the requested improvements is finished.

martinfleis commented 4 years ago

@anitagraser what about JOSS submission? Is it happening in the end? If you want to submit to JOSS we’ll have to review the paper and tick couple of other boxes.

anitagraser commented 4 years ago

@martinfleis Right! For JOSS, I'll still have to get a DOI.

I might bump the MovingPandas version to 0.3. (Movingpandas 0.2 works for geopandas 0.6.3. MovingPandas 0.3 works with GeoPandas 0.7)

The paper is in a dedicated brach for now: https://github.com/anitagraser/movingpandas/blob/joss-paper/paper.md

Is there anything else?

martinfleis commented 4 years ago

I've checked the paper and there does not seem to be any issue re JOSS requirements. Just few very minor notes.

I'll formally update my review post to include JOSS boxes, but I am happy with the package and all responses and recommend it for a publication on both pyOpenSci and JOSS.

anitagraser commented 4 years ago

@martinfleis Thank you for reviewing the paper! I have fixed the first two issues. I don't see how to affect the line break (third issue you mentioned) and found that this issue also appears in recently published papers, such as https://joss.theoj.org/papers/10.21105/joss.02075