editorialbot commented 7 months ago

Submitting author: !--author-handle-->@kmcdono2@emdupre<!--end-editor-- Reviewers: @geekysquirrel, @PipGrylls, @jordibc Archive: 10.5281/zenodo.13643609

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/96e2d51da2c283a40c38101856f1f15c"><img src="https://joss.theoj.org/papers/96e2d51da2c283a40c38101856f1f15c/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/96e2d51da2c283a40c38101856f1f15c/status.svg)](https://joss.theoj.org/papers/96e2d51da2c283a40c38101856f1f15c)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@geekysquirrel & @PipGrylls & @jordibc, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @emdupre know.

✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨

Checklists

📝 Checklist for @PipGrylls

📝 Checklist for @jordibc

📝 Checklist for @geekysquirrel

editorialbot commented 7 months ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

editorialbot commented 7 months ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1145/3557919.3565812 is OK
- 10.5281/ZENODO.7147906 is OK
- 10.1093/jvcult/vcab009 is OK
- 10.1016/j.regsciurbeco.2021.103711 is OK
- 10.48550/ARXIV.2101.12478 is OK
- 10.21105/joss.01800 is OK

MISSING DOIs

- None

INVALID DOIs

- None

editorialbot commented 7 months ago

Software report:

github.com/AlDanial/cloc v 1.90  T=1.79 s (68.6 files/s, 22827.5 lines/s)
-----------------------------------------------------------------------------------
Language                         files          blank        comment           code
-----------------------------------------------------------------------------------
Python                              39           2726           5086           9424
CSV                                  9              0              0           1406
reStructuredText                    24           1274            792           1329
Jupyter Notebook                     9              0          16149            968
Markdown                            13            168              0            447
YAML                                19             80            125            442
JSON                                 4              0              0            281
TeX                                  1              9              0            103
DOS Batch                            1              8              1             26
TOML                                 1              1              2             17
Sass                                 1              3              3             12
make                                 1              4              7              9
Windows Resource File                1              0              0              4
-----------------------------------------------------------------------------------
SUM:                               123           4273          22165          14468
-----------------------------------------------------------------------------------

Commit count by author:

   654  Rosie Wood
   139  kasra-hosseini
    67  Kasra Hosseini
    63  Kalle Westerling
    51  Andy Smith
    48  Katie McDonough
    28  allcontributors[bot]
    17  Daniel C.S. Wilson
    11  ChristinaLast
     7  Rosie
     3  Ubuntu
     1  Evangeline Corcoran

editorialbot commented 7 months ago

Paper file info:

📄 Wordcount for paper.md is 1160

✅ The paper includes a Statement of need section

editorialbot commented 7 months ago

License info:

🟡 License found: Other (Check here for OSI approval)

editorialbot commented 7 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

emdupre commented 7 months ago

👋 Hi @geekysquirrel @PipGrylls @jordibc, and thank you again for agreeing to review this submission for MapReader !

The review will take place in this issue, and you can generate your individual reviewer checklists by asking editorialbot directly with @editorialbot generate my checklist.

In working through the checklist, you're likely to have specific feedback on MapReader. Whenever possible, please open relevant issues on the software repository (and cross-link them with this issue) rather than discussing them here. This helps to make sure that feedback is translated into actionable items to improve the software !

If you aren't sure how to get started, please see the Reviewing for JOSS guide -- and, of course, feel free to ping me with any questions !

PipGrylls commented 7 months ago

Review checklist for @PipGrylls

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/Living-with-machines/MapReader?
[x] License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
[x] Contribution and authorship: Has the submitting author (@kmcdono2) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
[x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
[x] Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
[x] Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
[x] Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[x] A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
[x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

PipGrylls commented 7 months ago

@editorialbot generate pdf

editorialbot commented 7 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

PipGrylls commented 7 months ago

I will pause my review at this point as the Install, Functionality, and Documentation prevent me from using the software or running tests.

kmcdono2 commented 7 months ago

I will pause my review at this point as the Install, Functionality, and Documentation prevent me from using the software or running tests.

@emdupre - is this a prompt for us to work on? I'm unclear what the blockers are specifically.

PipGrylls commented 7 months ago

Hi @kmcdono2 See these two issues: https://github.com/Living-with-machines/MapReader/issues/365 https://github.com/Living-with-machines/MapReader/issues/364

kmcdono2 commented 7 months ago

Perfect, thanks!

PipGrylls commented 7 months ago

Finally, regarding Contribution and authorship: Has the submitting author made major contributions to the software? Does the full list of paper authors seem appropriate and complete? I note that in the paper last two authors listed are the people with the highest level of contribution to the GH, whilst lines of code is not always equal to contribution I would like to see a justification of this author list ordering and an acknowledgment of developer roles.

kmcdono2 commented 7 months ago

Thanks for this @PipGrylls - we can include a note about this and also possibly re-order names (I don't think we paid this much attention, and of course we should).

We also use these 2 spaces in the repo to document roles: https://github.com/Living-with-machines/MapReader?tab=readme-ov-file#contributors & https://github.com/Living-with-machines/MapReader/blob/main/contributors.md, if it helps in the meantime.

jordibc commented 7 months ago

Review checklist for @jordibc

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/Living-with-machines/MapReader?
[x] License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
[x] Contribution and authorship: Has the submitting author (@kmcdono2) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
[x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
[x] Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
[x] Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
[x] Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[x] A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
[x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

emdupre commented 7 months ago

👋 Hi everyone, happy Monday !

@geekysquirrel I just wanted to check-in, as I noticed you have not yet created your reviewer checklist. If you are hitting any issues in doing so, please don't hesitate to let me know !

I also wanted to explicitly carry over your note from the pre-review issue :

Hi all, not sure where the best place to say this is but I've had an initial play and I've run into some issues with the installation that required a bit of fiddling so I think the other reviewers might appreciate a heads up:

The README suggests to install using python 3.8, which works for the most part but when it gets to installing Cartopy, it fails because the latest version requires minimum 3.9. I know it's not a dependency but without it, some of the worked examples are not as useful, so I reckon it should be installed. I originally tried with 3.11 which didn't work but I could get things running with 3.9.

Also I think it'd be good to update the docs to say that some of the examples (and indeed the tool and its dependencies) require quite a bit of disk space and downloads, so in case your machine is a bit older and/or you're not on a fast connection I'd set some time aside to run the downloads - one of them took almost an hour for me.

I believe this is now being discussed in https://github.com/Living-with-machines/MapReader/issues/364#issue-2168770591, but please feel free to either chime in there or open an additional issue as appropriate !

geekysquirrel commented 6 months ago

Thanks @emdupre and no, I'm not hitting any issues - I just don't have the bandwidth to do this during the week so I'm only check in on weekends. Last weekend when I checked there were some issues so I'm picking things up today.

geekysquirrel commented 6 months ago

Review checklist for @geekysquirrel

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/Living-with-machines/MapReader?
[x] License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
[x] Contribution and authorship: Has the submitting author (@kmcdono2) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
[x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
[x] Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
[x] Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
[x] Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

[x] Installation: Does installation proceed as outlined in the documentation?
[x] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[x] A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
[x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

geekysquirrel commented 6 months ago

So I've started the review but haven't finished it because I encountered a few bumps in the road. The issue related to installation is still open, so I'll delay the rest of my review until they have been resolved and I can test installation properly.

I've opened https://github.com/Living-with-machines/MapReader/issues/369, https://github.com/Living-with-machines/MapReader/issues/370 and https://github.com/Living-with-machines/MapReader/issues/371.

In the meantime here are a few thoughts that might be useful to the authors but I didn't feel they should be tickets:

Contribution and authorship Judging from the contributions, a large chunk of the software was written by @rwood-97. Of course I appreciate that junior researchers often do the legwork while senior researchers hold the patronage, I feel that for a journal in which the software is the most important part, @rwood-97 should be higher up the authors list.

Automated tests Tests also take a long time (10 minutes). I would recommend to mock external calls where possible. It would be more kind to people with slow connections.

kmcdono2 commented 6 months ago

@geekysquirrel - thanks, we are working on the authorship issue to highlight the immense work from Rosie (we have dealt with this differently for different pubs, we just need to put our heads together and come up with our plan for this one).

For install, this is essentially resolved, just awaiting a final confirmation from @PipGrylls. You should be able to proceed.

PipGrylls commented 6 months ago

Hi,Apologies please don’t wait on me as I am very busy for the next two weeks.I will come back to this end of March early April but I don’t have the capacity to check your install. I will try again to install and run tests once I am less fully loaded.Best,PipSent from my iPadOn 18 Mar 2024, at 13:55, Katie McDonough @.***> wrote: @geekysquirrel - thanks, we are working on the authorship issue to highlight the immense work from Rosie (we have dealt with this differently for different pubs, we just need to put our heads together and come up with our plan for this one). For install, this is essentially resolved, just awaiting a final confirmation from @PipGrylls. You should be able to proceed.

—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you were mentioned.Message ID: @.***>

emdupre commented 6 months ago

Hi all,

I've followed up on https://github.com/Living-with-machines/MapReader/issues/364 with my understanding of the current status.

If there are any other blockers, of course, please don't hesitate to know ! And thank you to @PipGrylls @geekysquirrel @jordibc for your time so far.

PipGrylls commented 6 months ago

I’m happy to tick off the installation with the caveats that the minimum python version be bumped to 3.9 and the troubleshooting be updated to include the exports that I linked in the issue

I have also ticked off the tests, which run with warnings that should be addressed asap. Lots of deprecation warning from numpy and pandas warnings https://pandas.pydata.org/pandas-docs/stable/indexing.html#attribute-access. Not strictly blocking but also not good.

Currently the documentation for worked examples is unclear. The docs link to this. I think a README here explaining how to launch the examples and a list of examples is whats needed. Running the annotation notebooks and the worked_examples/geospatial/classification_one_inch_maps/Pipeline.ipynb notebook seems like this is well done with just some signposting needed.

PipGrylls commented 6 months ago

As far a functionality goes the sphinx docs seem comprehensive and the pipeline shoes everything in one place. I haven’t tested everything exhaustively but it satisfies a cursory pass. So I’ve checked that off.

geekysquirrel commented 6 months ago

I've been busy over Easter but aim to have another go at the remaining points using updated code with the new changes over the weekend.

jordibc commented 6 months ago

@editorialbot generate pdf

editorialbot commented 6 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

geekysquirrel commented 6 months ago

I've had another go and some things still don't fully work. I'll follow @PipGrylls's example and tick them off on condition that they're addressed before publication:

tests still fail but they do exist and run and once that's fully fixed I'm happy
Installation works, provided python 3.9 is used. The docs still say 3.8 which should be a quick and easy fix.

I've had a bit of a play with the tool and it's a cool project that I'm sure helps those who work in the area to make their job easier. Also thanks @kmcdono2 for taking on board my feedback on the authors list and congratulations to @rwood-97 to the "promotion" 🥳

jordibc commented 6 months ago

Hi!

I've finished looking into all the parts and checked all the boxes since I find everything basically satisfactory. MapReader feels very round and mature. I'd like to comment on some minor things that it'd be nice to address though.

Installation

As @PipGrylls and @geekysquirrel mentioned, it wouldn't work with python 3.8 for me either. (I added some comments about it in https://github.com/Living-with-machines/MapReader/issues/364 .) When using python 3.9 or 3.10 it worked. I second the request to see it fixed in the doc.

Tests

With respect the automated tests, maybe I missed it, but other than in https://mapreader.readthedocs.io/en/latest/Contribution-guide/Code.html I couldn't find how to run them. I think it may be better to have them mentioned in the main README.md too (not just for development, since they are useful to check that one has a working installation), and probably mentioning that pip install "mapreader[geo,dev]" would get all that is needed for a full installation.

Also, even if they all pass with no errors (tried with both with python 3.9 and 3.10), they do produce quite a few warnings (3775). It would also be nice if external calls could be mocked, as @geekysquirrel mentioned.

Notebooks

With respect to the notebooks, I agree with @PipGrylls that some signposting would be very valuable.

They all run correctly, though I needed to do a couple of minor changes. In worked_examples/geospatial/classification_one_inch_maps/Pipeline.ipynb, I had to change ./models_tutorial/checkpoint_10.pkl -> ./models_tutorial/checkpoint_8.pkl. In worked_examples/non-geospatial/classification_plant_phenotype/Pipeline.ipynb I had to change patch-300-300-350-350-#2014-06-06_plant001_rgb.png#.png -> patch-200-200-250-250-#2014-06-06_plant001_rgb.png#.png. And in classification_plant_phenotype I had to change the file phenotype_#kasra#-[hash] with a different hash value. (There may have been very few other minor changes I had to do at some point too.)

I also noticed that a couple of csv files are actually tab-separated (so, tsv):

./non-geospatial/classification_plant_phenotype/annotations_phenotype_open_access/phenotype_test_#kasra#.csv
./geospatial/workshop_june_2023/annotations/buildings_#workshop#.csv

For consistency, they would probably be better changed into actual csv files. Or at least, renamed with the suffix "tsv".

It's also a bit unusual to find "#" in the names of files. If there's no particular good reason for that, I'd suggest using another name scheme.

One curiosity, I see that the ROC AUC goes up to 100. I suppose this is percentage. I'm more used to see it go up to 1, but if it is percentage, I think adding the "%" would help.

Documentation

In https://mapreader.readthedocs.io/en/latest/Worked-examples/Worked-examples.html it says:

MapReader can also for non-geospatial images

which is just missing "can also be used for" or similar.

Also, in section "Classification of plant phenotypes", where it says "In our plant phenotypes example [...]. It can be found here." the link in "here" is missing an initial "h" (it's linking to "ttps://github.com/Living-with-machines/MapReader/blob/main/worked_examples/non-geospatial/classification_plant_phenotype/Pipeline.ipynb").

In https://mapreader.readthedocs.io/en/latest/User-guide/Download.html the section "Downloader" just says:

TBC

Well... :)

As I said, they are minor comments. I think it's all basically okay, and actually a very nice piece of software. Congrats!

kmcdono2 commented 6 months ago

@jordibc @PipGrylls and @geekysquirrel - thanks for all these notes!

I wanted to say a few words about the README and design of names for notebooks. There is a long history to the origins of these notebooks, and we can and should do better in linking up what is referred to in the documentation and what is visible in the directory of worked examples. We'll get on this asap!

rwood-97 commented 6 months ago

Hi, quick update from me:

Installation

I've just merged #384 which updates our minimum python version to 3.9 and adds cartopy as a required dependency. This should simplify installation and fix the above issues with installing cartopy (I now realise its to do with a C++ library that is required for older versions of cartopy but not for the most recent version and so had been dropped from the cartopy docs).

Tests

I've created #385 to address the warnings in tests and to add ensure documentation on how to run them is clearly signposted. Will try and get on this asap. We also have #183 re. the mocking of external calls to NLS tileserver and will move this up our to do list.

Docs /worked examples

Re. worked examples, we've just created #386 to focus on this. I will add some info on how to clone the repo and run the notebooks for users that want to run the worked examples vs just use them for guidance. We also have an old ticket #234 about this with an idea for creating a tool for doing this. I've also just created #389 for the typos so I don't forget.

emdupre commented 6 months ago

Thank you very much to @PipGrylls, @geekysquirrel, and @jordibc for your careful reviews and for summarizing your current concerns here in-thread ! I really appreciate your efforts and time in supporting this JOSS review 💐

Thank you, too, for @kmcdono2 and @rwood-97 for being responsive to these points and for linking the associated issues in-thread ! I will monitor these to confirm how revisions are progressing. Please also feel free to tag me directly, either in these issues or on this global review thread.

Thank you all again for your work to date, and please don't hesitate to ping me here or via email if you have any questions or concerns at this point.

emdupre commented 5 months ago

👋 Hi @rwood-97 @kmcdono2 ! I just wanted to check in to see how this response is going for you.

I see that https://github.com/Living-with-machines/MapReader/pull/396 is currently open (in draft) to address https://github.com/Living-with-machines/MapReader/issues/385. I'm therefore assuming that this is still a work-in-progress, but please let me know if you have any specific concerns that I should be aware of !

kmcdono2 commented 5 months ago

Hi @emdupre - we're on it, just have a few things happening this and next week (including a big workshop about MapReader). We should be able to wrap things up the first week of May!

rwood-97 commented 5 months ago

Hi, sorry I forgot to update on this. As Katie said we have this workshop next week and for https://github.com/Living-with-machines/MapReader/pull/396 one of the changes is relatively big so I wanted to do it after our workshop to avoid introducing any new bugs. We should be able to address #386 and #389 this week though.

rwood-97 commented 5 months ago

Updates from us:

Installation

[x] #384 is done. This updates our minimum python version to 3.9 and adds cartopy as a required dependency.

Tests

[x] #385 is done. This addresses the warnings in tests and adds documentation on how to run them.
[x] #183 is done. This adds mocking of external requests to NLS tileserver in our tests.

Docs /worked examples

[x] #386 is done. This adds some info on how to clone the repo, run the notebooks for users that want to run the worked examples vs just use them for guidance.
[x] #389 is done. This fixes the typos mentioned in https://github.com/openjournals/joss-reviews/issues/6434#issuecomment-2041626766

Was there anything else we needed to fix/update? If not, I think this is all done from our end.

jordibc commented 4 months ago

Hi @rwood-97 ,

This looks good, thanks for the changes! Alas, for me the tests still produce thousands of errors -- I just tried with the updated branch main, and with both python 3.9 and 3.10. If you cannot reproduce it, I'll be happy to share the warnings. Just for reference, this is with: conda create -n mapreader python=3.0, conda activate mapreader, pip install "mapreader[dev]", and python -m pytest ..

A minor comment is that I still think it would be nice to mention how to run the tests in the main README.md (and not only in https://mapreader.readthedocs.io/en/latest/Contribution-guide/Code.html). Probably mentioning that pip install "mapreader[dev]" would get all that is needed for a full installation. It could all be mentioned for example just before "New users should refer to the Installation instructions and Input guidance for help with the initial set up of MapReader" (so they are directed there if they need more detailed installation instructions). If you don't think that's a good change to make, please just let me know.

Other than that, it's all good!

jordibc commented 4 months ago

Sorry, one more comment about the updates. Other things (of little importance) that are not addressed are the ones that I'm copying below. Please just let me know if you don't want to change it:

I also noticed that a couple of csv files are actually tab-separated (so, tsv):

./non-geospatial/classification_plant_phenotype/annotations_phenotype_open_access/phenotype_test_#kasra#.csv
./geospatial/workshop_june_2023/annotations/buildings_#workshop#.csv

For consistency, they would probably be better changed into actual csv files. Or at least, renamed with the suffix "tsv".

It's also a bit unusual to find "#" in the names of files. If there's no particular good reason for that, I'd suggest using another name scheme.

One curiosity, I see that the ROC AUC goes up to 100. I suppose this is percentage. I'm more used to see it go up to 1, but if it is percentage, I think adding the "%" would help.

rwood-97 commented 4 months ago

This looks good, thanks for the changes! Alas, for me the tests still produce thousands of errors -- I just tried with the updated branch main, and with both python 3.9 and 3.10. If you cannot reproduce it, I'll be happy to share the warnings. Just for reference, this is with: conda create -n mapreader python=3.0, conda activate mapreader, pip install "mapreader[dev]", and python -m pytest ..

Hi, I just tried to reproduce this and wonder what directory you are running python-m pytest . from? I accidentally just ran this from my home directory (as opposed to MapReader directory) and it was collecting tests and logging errors for ages and so wonder if this is the issue. When I then did cd MapReader and ran again it all passed.

A minor comment is that I still think it would be nice to mention how to run the tests in the main README.md (and not only in https://mapreader.readthedocs.io/en/latest/Contribution-guide/Code.html). Probably mentioning that pip install "mapreader[dev]" would get all that is needed for a full installation. It could all be mentioned for example just before "New users should refer to the Installation instructions and Input guidance for help with the initial set up of MapReader" (so they are directed there if they need more detailed installation instructions). If you don't think that's a good change to make, please just let me know.

Hi, we've also added how to run tests in the Install instructions page of the docs. We'd like to keep everything in the docs vs. README as much as possible so its all in one place. Let me know if you think this needs more fleshing out though.

I also noticed that a couple of csv files are actually tab-separated (so, tsv):
./non-geospatial/classification_plant_phenotype/annotations_phenotype_open_access/phenotype_test_#kasra#.csv
./geospatial/workshop_june_2023/annotations/buildings_#workshop#.csv
For consistency, they would probably be better changed into actual csv files. Or at least, renamed with the suffix "tsv".

Sorry I missed this one, have just changed it now.

It's also a bit unusual to find "#" in the names of files. If there's no particular good reason for that, I'd suggest using another name scheme.

The # is to do with being able to split the names back out (i.e. with str.split("#")) later on.

One curiosity, I see that the ROC AUC goes up to 100. I suppose this is percentage. I'm more used to see it go up to 1, but if it is percentage, I think adding the "%" would help.

Again, sorry I missed this one, have just updated all the notebooks to have (%) in their y-labels.

jordibc commented 4 months ago

Thanks! Everything looks good!

Hi, I just tried to reproduce this and wonder what directory you are running python-m pytest . from? I accidentally just ran this from my home directory (as opposed to MapReader directory) and it was collecting tests and logging errors for ages and so wonder if this is the issue. When I then did cd MapReader and ran again it all passed.

I was running from the main MapReader directory, but it seems I was running with an installation from PyPI (pip install "mapreader[dev]", instead of using -U to update, or a fresh pip install -e ".[dev]").

Indeed all those thousands of warnings are gone, great! With python 3.10 I got a few warnings (91) that I don't think are important but I'm copying just for reference:

tests/test_annotator.py: 18 warnings
tests/test_classify/test_datasets.py: 9 warnings
tests/test_geo_pipeline.py: 1 warning
tests/test_load/test_geo_utils.py: 1 warning
tests/test_load/test_images.py: 55 warnings
  /data/dotstuff/local/mambaforge/envs/mapreader/lib/python3.10/site-packages/rasterio/__init__.py:304: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned.
    dataset = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)

tests/test_classify/test_classifier.py::test_init_resnet18_hf
tests/test_classify/test_classifier.py::test_infer_hf_models
  /data/dotstuff/local/mambaforge/envs/mapreader/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
    warnings.warn(

tests/test_classify/test_classifier.py::test_init_resnet18_hf
tests/test_classify/test_classifier.py::test_infer_hf_models
  /data/dotstuff/local/mambaforge/envs/mapreader/lib/python3.10/site-packages/transformers/models/convnext/feature_extraction_convnext.py:28: FutureWarning: The class ConvNextFeatureExtractor is deprecated and will be removed in version 5 of Transformers. Please use ConvNextImageProcessor instead.
    warnings.warn(

tests/test_classify/test_classifier.py::test_calculate_add_metrics
tests/test_classify/test_classifier.py::test_calculate_add_metrics
tests/test_classify/test_classifier.py::test_calculate_add_metrics
  /data/dotstuff/local/mambaforge/envs/mapreader/lib/python3.10/site-packages/sklearn/metrics/_classification.py:1509: UndefinedMetricWarning: Recall is ill-defined and being set to 0.0 in labels with no true samples. Use `zero_division` parameter to control this behavior.
    _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))

tests/test_geo_pipeline.py::test_pipeline
tests/test_geo_pipeline.py::test_pipeline
tests/test_geo_pipeline.py::test_pipeline
  /data/dotstuff/local/mambaforge/envs/mapreader/lib/python3.10/site-packages/sklearn/metrics/_classification.py:1509: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
    _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
======================================================== 189 passed, 94 warnings

Hi, we've also added how to run tests in the Install instructions page of the docs. We'd like to keep everything in the docs vs. README as much as possible so its all in one place. Let me know if you think this needs more fleshing out though.

Yes, keeping it in one place which is easily found makes total sense to me. Thanks!

The # is to do with being able to split the names back out (i.e. with str.split("#")) later on.

I see, thanks for explaining.

One curiosity, I see that the ROC AUC goes up to 100. I suppose this is percentage. I'm more used to see it go up to 1, but if it is percentage, I think adding the "%" would help.

Again, sorry I missed this one, have just updated all the notebooks to have (%) in their y-labels.

No worries at all, it was totally minor. I see it is all changed and I'm happy to sign off on this.

Thanks again for your prompt replies and congrats on a great work!

rwood-97 commented 4 months ago

Great, very pleased to know its working!

Re. warnings, the rasterio ones are all to do with loading a non-georeferenced image (i.e. png file) in rasterio. Its part of a try/except where we try get georeferencing information from the image (e.g. if you loaded a tiff it could have it but not always) so we do it for all images. We probably should work out a way to suppress these warnings. The other ones I also get but hadn't worked out/got around to addressing and don't (yet) seem to be a problem. :)

emdupre commented 4 months ago

Thank you @jordibc for reviewing these changes and @rwood-97 for your quick responses !

@PipGrylls, @geekysquirrel, please let us know when you also are able to re-review this submission for MapReader.

And thank you again to @ everyone for your work on this review to date !

emdupre commented 4 months ago

Hi everyone,

I just wanted to follow-up on this submission for MapReader.

Currently, it looks like @geekysquirrel 's testing concerns in https://github.com/Living-with-machines/MapReader/issues/369 is still open ; could you please confirm that this is waiting for reviewer input @rwood-97 @kmcdono2 ?

Otherwise, @PipGrylls had raised this point on example usage:

Currently the documentation for worked examples is unclear. The docs link to this. I think a README here explaining how to launch the examples and a list of examples is whats needed. Running the annotation notebooks and the worked_examples/geospatial/classification_one_inch_maps/Pipeline.ipynb notebook seems like this is well done with just some signposting needed.

I believe https://github.com/Living-with-machines/MapReader/issues/386 is intended to address this, but it would be great to get confirmation from both the authors as well as @PipGrylls himself !

Otherwise, please let us know if you have any remaining comments @jordibc @geekysquirrel @PipGrylls , and thank you again for all of your work to date !

kmcdono2 commented 4 months ago

Thanks @emdupre - I have added a note to https://github.com/Living-with-machines/MapReader/issues/369, but @rwood-97 and I understand this to be resolved.

Likewise, https://github.com/Living-with-machines/MapReader/issues/386 is done from our end!

PipGrylls commented 4 months ago

Hi,

I've looked at the changes in 386 and I am happy they address my concerns.

PipGrylls commented 4 months ago

I’ve installed mapreader and run the tests on a M2 Mac and it now works well.

My previous concern as having the recommended install pathway mixing conda and pip remains but isn't a blocker. In an ideal world there would be a:

'conda install’: using conda and the anaconda channels
‘pypi install’: using venv/virtual environments and PyPi
'source install’: using venv/virtual environments and the Github (this would be the way to run the tests)

I think finally I would recommend having the example notebooks being cloned from a repo that does not also contain the source code so people can clone only them. (you could then use submodules to make this part of the source module to avoid maintaining multiple versions)

rwood-97 commented 4 months ago

I think finally I would recommend having the example notebooks being cloned from a repo that does not also contain the source code so people can clone only them. (you could then use submodules to make this part of the source module to avoid maintaining multiple versions)

I like this idea, will make an issue on the MapReader repo to address this at some point (probably won't happen soon but good to have in our backlog)

emdupre commented 3 months ago

Hi everyone,

Confirming that I've now heard back from all reviewers, and I will now go ahead and perform a few editorial checks to continue processing this submission. Thank you again @PipGrylls, @jordibc, @geekysquirrel for your reviews of MapReader for JOSS ! 💐

openjournals / joss-reviews

[REVIEW]: MapReader: v1.1.0 #6434

Status

Reviewer instructions & questions

Checklists

Review checklist for @PipGrylls

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Review checklist for @jordibc

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Review checklist for @geekysquirrel

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Installation

Tests

Notebooks

Documentation

Installation

Tests

Docs /worked examples

Installation

Tests

Docs /worked examples