Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @jdalzatec, @omshinde, @nuest, @martinfleis it looks like you're currently assigned to review this paper :tada:.
:warning: JOSS reduced service mode :warning:
Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.
:star: Important :star:
If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿
To fix this do the following two things:
For a list of things I can do to help you, just type:
@whedon commands
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
@whedon generate pdf
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.1109/socialcom.2013.17 is OK
- 10.5220/0009885702940301 is OK
MISSING DOIs
- None
INVALID DOIs
- None
@whedon generate pdf
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
👋 @martinfleis, @omshinde, @nuest and @jdalzatec
Thank you all for volunteering as reviewers for this paper! At the top, you'll find individual checklists to work through; please let me know if something is not clear or if you need any help.
Thanks! Just as a quick heads up, I'll likely do the review beginning of next week.
Thank you for reviewing our submission, and sorry for the delay.
About mentioned missing items:
[1] Takada, M., Kondo, N., Hashimoto, H.: Japanese study on stratification, health, income, and neighborhood: study protocol and profiles of participants. Journal of Epidemiology 24(4), 334–344 (2014)
[2] Frank, L.D., Sallis, J.F., Saelens, B.E., Leary, L., Cain, K., Conway, T.L., Hess, P.M.: The development of a walkability index: application to the neighborhood quality of life study. British Journal of Sports Medicine 44(13), 924–933 (2010)
[3] Garau, C., Pavan, V.M.: Evaluating urban quality: Indicators and assessment tools for smart sustainable cities. Sustainability 10(3), 575 (2018)
[4] Leong, M., Dunn, R.R., Trautwein, M.D.: Biodiversity and socioeconomics in the city: a review of the luxury effect. Biology Letters 14(5), 20180082 (2018)
[5] Barret, N., Duchateau, F., Favetta, F., Bonneval, L.: Predicting the environment of a neighborhood: a use case for France. In: International Conference on Data Management Technologies and Applications (DATA). pp. 294–301. SciTePress (2020)
:wave: @jdalzatec, please update us on how your review is going.
:wave: @omshinde, please update us on how your review is going.
:wave: @nuest, please update us on how your review is going.
:wave: @martinfleis, please update us on how your review is going.
It will likely take some time before I'll manage to do my review. Hard to estimate now, I haven't properly looked into the complexity of the package yet.
@omshinde, please update us on how your review is going.
It's coming along nicely. I have started reviewing it locally based on the checklist, will keep posted via updating the checklist. Thanks!
@whedon I am playing a bit with the package while going through the checklist. I'll take some more time while reviewing it locally. Thanks!
I'm sorry human, I don't understand that. You can see what commands I support by typing:
@whedon commands
@jdalzatec @martinfleis @omshinde thank you all for the updates, please let me know if you need my help in any way.
@fduchatea thanks for the update on your part, please see my answers below:
Permit individuals to create issues/file tickets against your repository: it seems that the gitlab instance of our lab requires authentication (of lab members) for creating issues, and this policy will not change. Should we switch to another platform such as gitlab.com?
To avoid having you migrate everything to another repo I propose two options:
Expand the description of the software: Should we expand the description here, in README or in the paper?
In the paper.
Expand and focus the software's research scope: Neighbourhoods are a very common concept in studies from diverse domains such as health, social sciences, or biology. For instance, Japanese researchers investigated the relationships between social factors and health by taking into account not only behavioural risks, but also housing and neighbourhood environments [1]. In a British study, authors describe how living areas have an impact on physical activities, from which they determine a walkability index at the neighbourhood level for improving future urban planning [2]. Smart cities also consider neighbourhoods as an ideal unit division for measuring urban quality [3]. Lastly, a survey describes the luxury effect, i.e., the impact of wealthy neighbourhoods on the surrounding biodiversity [4]. However, there is no clear definition of the neighbourhood environment. Our tool fills this gap by defining neighbourhoods and their environment, characteristics of these neighbourhoods and an interface for using popular machine-learning algorithms. These elements can be extended/enriched. Our tool has so far been used to measure the impact of the neighbourhood's environment when people move to another city [5]. But it could be extended to other application domains: measuring the pollution degree in neighbourhoods, determining whether a neighbourhood is suitable as a stopover for migratory birds, etc. We can reformulate and add this research scope in the paper if needed (we are already above the 1,000 words limit though).
I think we can just refactor the introduction and the statement of need to reflect the research need and scope by including some of the applications that you listed above. Could you take a first pass at it and I can help refine after that?
@galessiorob
Thanks for the hints. We have created a repository with a public tracker: https://gitlab.com/fduchate/review-repo-predihood
The first sections of the paper have been reformulated to broaden the description.
@whedon generate pdf
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
I know it is not my role to judge this but I believe that JOSS's requirement that the software must "Permit individuals to create issues/file tickets against your repository" is not related to the review process but to the actual software repository. Therefore a review-only solution is not enough. The reason why I think so is that software in a private, although readable, repository is not entirely open in a community sense. Using Matt Rocklin's seven stages of open software, predihood is currently on stage 2, while my feeling about JOSS is that it requires stage 3.
But again, it is up to editorial team of JOSS to judge this.
@martinfleis That's my understanding, too. https://gitlab.liris.cnrs.fr/fduchate/predihood/issues requires me to register/sign-in with a LIRIS account, which I, or any other external users, do not have.
@fduchatea Would you be willing to switch your "main" repository to a more open platform?
@galessiorob Please advise.
Some first review comments and questions
I stopped the review at this point as some things seem to be missing (test files, clarification of authorship, public issue tracker). I invite the authors to respond and kindly ask to revisit the JOSS guidelines and the review checklist.
Note to self:
source .venv/joss-predihood/bin/activate
docker run --name mongiris -it -p 27017:27017 mongo:4
docker start mongiris
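Before launching the application against the container above, it can help to confirm that something is actually listening on the published port. The following is only a minimal sketch using the Python standard library; the function name `mongo_reachable` is hypothetical, and the host and port are assumptions taken from the `-p 27017:27017` mapping in the `docker run` command.

```python
import socket


def mongo_reachable(host="localhost", port=27017, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: it only checks that the port accepts connections,
    not that the server behind it is MongoDB or that the dump was restored.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolved host names.
        return False
```

A `False` result here would point to the container not running (or the port not being published) rather than to a problem in the package itself.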
I was not able to complete all steps in the README, please @fduchatea clarify - thanks!
- I had to also run `pip install wheel` in my virtual environment to install predihood
- `docker exec`ed into it, installed wget, downloaded the database dump, ran `mongorestore`
- I was not able to locate `dump-iris.bin` locally after installation. I did find the file `dump-dbinsee.bin` after manually downloading the mongiris-master.zip from https://gitlab.liris.cnrs.fr/fduchate/mongiris - Is this the file?
- Please expand in the README how to run the tests (not just mention the file name); also, the file `tests.py` seems to be missing anyway.
@martinfleis and @nuest, thank you both for your thoughtful comments and assessments on the current state of accessibility of this software. Our submission requirements state that the software must:
@fduchatea can you change the settings on the repo so they comply with the above, and we'd also need this repo to be the source one - not the one that's restricted to having a signing for LIRIS.
CC @arfon @danielskatz
@fduchatea checking in on the aforementioned requirements, want to make sure you'll be able to provide them so we can carry on with the review. Let me know if you have any questions!
@galessiorob we understand that it is an issue not to let individuals create tickets. We have migrated to a more open platform and we are discussing the mentioned requirements. https://gitlab.com/fduchate/predihood
Thanks so much @fduchatea
@galessiorob @nuest
As previously mentioned, the new repository is now on https://gitlab.com/fduchate/predihood for better accessibility.
Following are answers to most of the mentioned points.
Contribution and authorship: there are three authors on the paper, but only two contributors on the repo - please clarify
This project was implemented during a 6-month training period. Nelly B. was supervised by F. Favetta and F. Duchateau.
Although F. Favetta has not directly performed commits, he actively participated in the supervision of Nelly, definition and ideas about algorithms and implementation details, testing/installing of Predihood, writing/reviewing the paper.
I'm hesitant to check off the "Substantial scholarly effort" as defined here. Has the software been used in other use cases than your own yet? I understand it makes addressing research challenges potentially easier, but I'd like to understand better how other researchers can load their own data into predimap, and how they can get their data out of the system again. The example you provide (find a new place where to live) is interesting, but not a scientific application IMO. What would be a showcase that answers a research question?
Predihood is the result of 6 months of work (February to August 2020) by 3 people. Besides, it reuses mongiris (a light API for interacting with the neighbourhood database), which itself required a few months of work.
As discussed with galessiorob, the tool has currently been only used in our context (tool developed and paper published in Summer 2020, which also limits its visibility).
We have added examples of potential applications in the paper: measuring the pollution degree in neighbourhoods, determining whether a neighbourhood is suitable as stopover for migratory birds, etc.
We are developing another (small) use case to explain how to load other data.
We have added an export functionality for all neighbourhoods selected on the map. In our initial context, we had to copy the results (the search was limited to several neighbourhoods) but we agree that prediction from the map should have an export functionality.
What about making the tool easier to try out by providing a Docker image? Or you could provide instructions for using MongoDB inside a container (which I prefer)
We do not have any expertise on Docker, so we will probably not be able to take into account this point in reasonable time.
Maybe some contributors may be willing to add this feature later?
I was not able to locate dump-iris.bin locally after installation. I did find the file dump-dbinsee.bin after manually downloading the mongiris-master.zip from https://gitlab.liris.cnrs.fr/fduchate/mongiris - Is this the file?
Yes, this is the database file. We have updated the README.
On the map, I can calculate classifications for some areas, but not for all (sometimes nothing happens) - is that expected? If yes, it should be communicated more clearly to the user.
We were not able to reproduce this bug. We have tested hundreds of neighbourhoods, but as there are 50,000 in the database, we.
Do you have some neighbourhood names or codes (which do not produce any prediction) so that we can investigate?
Starting from the map: How can I export the calculated values? How can I compare the results of the different algorithms? Do I have to manually copy them to an external tool/spreadsheet? (Sorry if I'm missing something.)
You are right, we used to manually copy results when predicting on the map. We have added an export functionality on the cartographic interface (XSL file export).
In the training part, there is an export function of the results.
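To make the comparison question concrete, an export that lines up the predictions of several algorithms side by side could look roughly like the sketch below. This is not the project's actual export code: the function name `predictions_to_csv`, the input layout (a dict keyed by neighbourhood code) and the CSV format are all assumptions made for illustration.

```python
import csv
import io


def predictions_to_csv(predictions):
    """Serialise {neighbourhood_code: {algorithm_name: label}} as CSV text.

    Hypothetical layout: one row per neighbourhood, one column per algorithm,
    so results of different algorithms can be compared in a spreadsheet.
    """
    # Collect every algorithm name that appears for at least one neighbourhood.
    algorithms = sorted({name for per_algo in predictions.values() for name in per_algo})
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["neighbourhood", *algorithms],
        restval="",  # leave the cell empty when an algorithm gave no prediction
        lineterminator="\n",
    )
    writer.writeheader()
    for code in sorted(predictions):
        writer.writerow({"neighbourhood": code, **predictions[code]})
    return buffer.getvalue()
```

Keeping one column per algorithm means a missing prediction shows up as an empty cell instead of silently shifting values.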
On the classifier training part. When I select both "Remove outliers" and "Remove rural", I get errors in the console but only a generic error message in the UI - more helpful error messages for the users would be important. Below, the problem is that the test size is negative, but the message is generic
Error when selecting both outliers and rural is fixed (issue due to Python 3.8).
Specific error messages have been added in the UI.
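As an illustration of the kind of specific message asked for here, the split sizes can be validated before any training starts. The sketch below is an assumption-laden example, not the fix that was actually committed: `split_counts` and its parameters are hypothetical names, and it only covers the case where filtering leaves too few rows for the requested split.

```python
def split_counts(n_rows, train_percent):
    """Return (n_train, n_test) for a percentage split, or raise a specific error.

    Hypothetical helper: n_rows is the number of rows left after filters such
    as "Remove outliers" and "Remove rural" have been applied.
    """
    if not 0 < train_percent < 100:
        raise ValueError(
            f"Train percentage must be strictly between 0 and 100 (got {train_percent})."
        )
    n_train = round(n_rows * train_percent / 100)
    n_test = n_rows - n_train
    if n_train < 1 or n_test < 1:
        # Name the actual cause instead of surfacing a generic failure.
        raise ValueError(
            f"Only {n_rows} rows remain after filtering; cannot build a "
            f"{train_percent}% / {100 - train_percent}% train/test split. "
            "Try disabling one of the filters."
        )
    return n_train, n_test
```

The message of the raised `ValueError` can then be shown in the UI as is, so the user sees which setting to change.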
Please expand in the README how to run the tests (not just mention the file name); also, the file tests.py seems to be missing anyway.
We updated the path to the test file (predihood/tests.py) and we indicated how to run them.
I suggest to mention the license in the README - not everyone will know where to look for it
Added.
I had to also run pip install wheel in my virtual environment to install predihood
We have tested in a new (clean) virtual environment, and there was no need to install wheel.
According to documentation and Stackoverflow, it seems that wheel is packaged automatically with latest versions of pip (above 19.2).
We are working on the last points (loading data and community guidelines - although we are not sure what to write for the latter).
Thanks for the updates so far!
To me making clear how the software is useful and usable (data import/export) to others is key, because if it is limited to your own research, I personally think the corresponding scientific article is a suitable way to reference it. Just wanted to put that here as this concern might not have been clear in my previous comments.
Re. community guidelines: take a look at recent published JOSS papers and I'm sure you'll find some good examples.
I'll wait for you to complete your changes and will revisit my review then.
Just a ping here - I have opened an issue https://gitlab.com/fduchate/predihood/-/issues/1 since the installation following the instructions failed in my case. Therefore I am waiting for this to be resolved.
We are refactoring the code to facilitate import of new data. We will check the issue later.
Hi @fduchatea! I hope you had a nice holiday break 🎄
Checking in on several things so the reviewers can keep doing their work; some of the major issues to work on from your side:
Let me know if I can help clarify any of this!
Hi @galessiorob Thank you, and we wish a Happy New Year to all of you.
We have added a new (fake) dataset (prediction works on it), a CSV export functionality and the Community guidelines (based on the links you provided, thanks). The README has been updated to describe how to use/import another dataset.
The updated code is still on a dev branch (we are still performing some tests). We should merge the branch at the beginning of next week and will tell you.
Thanks, @fduchatea !
We have merged into the master branch. We are still correcting a few bugs, but it is possible to load and predict for another dataset. The README has been updated to reflect these changes.
Thank you @fduchatea
@jdalzatec, @omshinde, @nuest, @martinfleis please resume your reviews at your earliest convenience 🙏 Let me know if there's anything I can help with.
New bugs have been corrected this week:
@fduchatea Are you currently actively developing and fixing bugs? I don't want to play catch with the developments for the review. (AFAIK, the JOSS submission should happen for rather stable pieces of software.)
@nuest No we are not in active development. We refactored the code 2 weeks ago to facilitate the use of new datasets (code merged around Jan 20th) and we are not developing anymore. Last week we noticed a few bugs so we corrected them, but the application is usable.
@jdalzatec, @omshinde, @nuest, @martinfleis I recognize that this paper has elongated past the ideal time, if you can, please resume your reviews. Let me know if you have a conflict.
Thanks!
@galessiorob Hi Gabby, I will check it ASAP. Thanks!
Hi @galessiorob ! Thanks for the reminder. I will submit my reviews as early as possible.
The major blocker on my side is the installation (xref https://gitlab.com/fduchate/predihood/-/issues/1). I was not able to install and run `predihood` so far despite several attempts. An application of this kind would hugely benefit from a docker container which would directly start the app on `docker run`.
@whedon generate pdf
PDF failed to compile for issue #2805 with the following error:
Can't find any papers to compile :-(
@whedon generate pdf
@galessiorob - I updated the repository address above to point to https://gitlab.com/fduchate/predihood (I think that's correct)
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
@arfon thank you!
Hi @fduchatea and authors! cc: @galessiorob
System settings while reviewing:
Installation - Local Machine
OS - Ubuntu 20.04 (WSL)
Python version - 3.8.5
Thank you for submitting a wonderful application, I foresee Predihood's utility as a great application for predicting insights about a (the) neighborhood (s). Please find my thoughts below regarding the points mentioned in the review checklist:
`mongiris` package following the installation instructions here. It would be much more convenient to use a Docker VM for setting up Predihood, as already mentioned in one of the comments above.
So, the installation needs to be checked and reviewed comprehensively. Running `python3 main.py` raised an error. Please follow the error log below and correct me if I am following any step incorrectly. The error is possibly due to the missing `configs` directory as mentioned here:
(predihood) rajat@rajat-infinity:/mnt/d/JOSS Reviews/predihood/predihoodClone/predihood$ python3 main.py
WARNING:__main__:No parameter provided, loading default dataset configuration file (configs/config_hil.json)
Traceback (most recent call last):
File "main.py", line 343, in <module>
dataset_config = load_dataset_config(dataset_config_file)
File "/mnt/d/JOSS Reviews/predihood/predihoodClone/predihood/utility_functions.py", line 537, in load_dataset_config
with open(json_file_path) as data_file:
FileNotFoundError: [Errno 2] No such file or directory: 'configs/config_hil.json'
Though `python3 main.py datasets/hil/config.json`, as described here, worked perfectly.
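The traceback above suggests that the default configuration path is resolved relative to the current working directory and no longer exists. A minimal sketch of a more forgiving loader follows; it is not the package's implementation. The `base_dir` parameter and the choice of `datasets/hil/config.json` as default are assumptions, the latter taken from the command that worked.

```python
import json
from pathlib import Path

# Assumed default, based on the invocation that worked in this review.
DEFAULT_CONFIG = Path("datasets/hil/config.json")


def load_dataset_config(path=None, base_dir=None):
    """Load a dataset configuration file, failing with an actionable message.

    Hypothetical variant: relative paths are resolved against base_dir
    (the current working directory when base_dir is not given).
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    config_path = base / (Path(path) if path else DEFAULT_CONFIG)
    if not config_path.is_file():
        raise FileNotFoundError(
            f"Dataset configuration not found: {config_path}. "
            "Pass an existing file, e.g. `python3 main.py datasets/hil/config.json`."
        )
    with config_path.open(encoding="utf-8") as data_file:
        return json.load(data_file)
```

Resolving against an explicit base directory (for example the directory containing `main.py`) would make the default independent of where the command is launched from.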
The examples worked like a charm. I am adding some screenshots below in the Other section, verifying the functionality. Also, I really liked the documentation generated using `pdoc`, but it would be more convenient if the entire documentation were hosted online (maybe using GitHub Pages), as currently it is required to open the `HTML` files manually from the local directory.
Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
On running `python3 tests.py`, I received the following error. It is possibly due to missing config and dataset files inside the `dataset` directory. I tried it again by copying the `csv` and `config` files from the `hil` directory to one level up here in my working environment, but the same error persisted. The authors are recommended to verify it.
(predihood) rajat@rajat-infinity:/mnt/d/JOSS Reviews/predihood/predihoodClone/predihood$ python3 tests.py
test_add_assessment_to_file (__main__.TestCase) ... ok
test_address_to_city (__main__.TestCase) ... ok
test_address_to_code (__main__.TestCase) ... ok
test_get_classifier (__main__.TestCase) ... ok
test_get_most_frequent (__main__.TestCase) ... ok
test_indicator_full_to_short_label (__main__.TestCase) ... ERROR
test_indicator_short_to_full_label (__main__.TestCase) ... ERROR
test_intersection (__main__.TestCase) ... ok
test_set_classifier (__main__.TestCase) ... ok
test_signature (__main__.TestCase) ... ok
test_similarity (__main__.TestCase) ... ok
test_train_test_percentages (__main__.TestCase) ... ok
test_union (__main__.TestCase) ... ok
test_values_dataset (__main__.TestCase) ... ERROR
Traceback (most recent call last):
File "tests.py", line 43, in test_indicator_full_to_short_label
short_label = indicator_full_to_short_label(full_label)
File "/mnt/d/JOSS Reviews/predihood/predihoodClone/predihood/utility_functions.py", line 113, in indicator_full_to_short_label
indicators = model.get_indicators_dict()
File "/mnt/d/JOSS Reviews/predihood/predihoodClone/predihood/model.py", line 153, in get_indicators_dict
list_indicators = db.find_all(db.collection_indicators)
AttributeError: 'NoneType' object has no attribute 'find_all'
Traceback (most recent call last):
File "tests.py", line 49, in test_indicator_short_to_full_label
full_label = indicator_short_to_full_label(short_label)
File "/mnt/d/JOSS Reviews/predihood/predihoodClone/predihood/utility_functions.py", line 132, in indicator_short_to_full_label
indicators = model.get_indicators_dict()
File "/mnt/d/JOSS Reviews/predihood/predihoodClone/predihood/model.py", line 153, in get_indicators_dict
list_indicators = db.find_all(db.collection_indicators)
AttributeError: 'NoneType' object has no attribute 'find_all'
Traceback (most recent call last):
File "tests.py", line 145, in test_values_dataset
dataset = pd.read_csv(filename)
File "/mnt/d/JOSS Reviews/predihood/predihood/lib/python3.8/site-packages/pandas/io/parsers.py", line 688, in read_csv
return _read(filepath_or_buffer, kwds)
File "/mnt/d/JOSS Reviews/predihood/predihood/lib/python3.8/site-packages/pandas/io/parsers.py", line 454, in _read
parser = TextFileReader(fp_or_buf, kwds)
File "/mnt/d/JOSS Reviews/predihood/predihood/lib/python3.8/site-packages/pandas/io/parsers.py", line 948, in __init__
self._make_engine(self.engine)
File "/mnt/d/JOSS Reviews/predihood/predihood/lib/python3.8/site-packages/pandas/io/parsers.py", line 1180, in _make_engine
self._engine = CParserWrapper(self.f, self.options)
File "/mnt/d/JOSS Reviews/predihood/predihood/lib/python3.8/site-packages/pandas/io/parsers.py", line 2010, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: '../predihood/datasets/data_density.csv'
Ran 14 tests in 7.446s
FAILED (errors=3)
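Two of the three errors above come from `db` being `None`, i.e. the tests call into the database without a connection. One common way to keep the rest of the suite meaningful in that situation is to skip database-dependent tests explicitly. The sketch below uses only `unittest`; `connect_or_none` is a hypothetical stand-in for whatever helper opens the connection, and the test bodies are placeholders rather than the project's tests.

```python
import unittest


def connect_or_none():
    """Hypothetical stand-in for the project's connection helper.

    Returns None here to simulate an unreachable database.
    """
    return None


db = connect_or_none()


class IndicatorTests(unittest.TestCase):
    @unittest.skipIf(db is None, "database not reachable; skipping database-backed test")
    def test_indicator_labels(self):
        # Would only run when a connection object is available.
        self.assertIsNotNone(db.find_all("indicators"))

    def test_intersection(self):
        # Pure-Python test that needs no database and always runs.
        self.assertEqual(sorted({1, 2} & {2, 3}), [2])
```

With this pattern a missing database is reported as a skipped test with a reason, instead of an `AttributeError` that looks like a bug in the code under test.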
3. Software paper
- The software paper contains some spelling errors and typos. The authors are recommended to carefully review the software paper for the following issues:
- Incorrect Bibtex for reference 2
Replace `livehoods` -> `Livehoods`
- Typo
Section: Methodology (citing from the paper)
`Predihood provides the following functionnalities: - adding new neighbourhoods and indicators to describe them; - predict the environment of a neighbourhood by configuring and usingpredefined algorithms; - adding new predictive algorithms.`
- Replace `functionnalities` -> `functionalities`
- Bulleted points are not rendered properly
Section: Adding new data
`It includes about 50,000 neighbourhoods with 640 indicators, and 270 neighbouhoods were` ...
- `and 270 neighbouhoods` -> `and 270 neighbourhoods`
Section: Predicting environment
- `optionnaly` -> `optionally`
4. Other
Below are the screenshots:
- Main screen
![1](https://user-images.githubusercontent.com/21292545/110232195-5a086200-7f42-11eb-86cf-a8cc31345ea8.png)
- Cartographic Interface of Predihood
![2](https://user-images.githubusercontent.com/21292545/110232196-5bd22580-7f42-11eb-8ebd-4eff9f035c70.png)
- Results obtained after executing Random Forest Classifier
![3](https://user-images.githubusercontent.com/21292545/110232198-5d035280-7f42-11eb-9375-d36822796da1.png)
- Visualization of Confusion Matrix
![4](https://user-images.githubusercontent.com/21292545/110232199-5ffe4300-7f42-11eb-9775-c3cc4671b298.png)
My remarks:
I would like to congratulate the authors for their efforts and for developing a neat software package. I envisage that Predihood would be very useful and find extended applications in various domains along with the core objective of predicting insights about neighborhoods. I would recommend the authors review the above feedback, especially the existing issues with the installation.
Kind regards,
Rajat
Submitting author: @fduchatea (Fabien Duchateau) Repository: https://gitlab.com/fduchate/predihood Version: v1.1 Editor: @galessiorob Reviewer: @jdalzatec, @omshinde, @nuest, @martinfleis Archive: 10.5281/zenodo.4737729
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@jdalzatec & @omshinde & @nuest & @martinfleis, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @galessiorob know.
✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨
Review checklist for @jdalzatec
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
Review checklist for @omshinde
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
Review checklist for @nuest
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
Review checklist for @martinfleis
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper