openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal
707 stars 37 forks source link

[REVIEW]: Dolphin: A Python package for large-scale InSAR PS/DS processing #6997

Open editorialbot opened 2 months ago

editorialbot commented 2 months ago

Submitting author: !--author-handle-->@scottstanie<!--end-author-handle-- (Scott Staniewicz) Repository: https://github.com/isce-framework/dolphin Branch with paper.md (empty if default branch): joss Version: v0.19.0 Editor: !--editor-->@mikemahoney218<!--end-editor-- Reviewers: @McWhity, @margauxmouchene Archive: Pending

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/22654c1f71df2ecf3ada009499ed8961"><img src="https://joss.theoj.org/papers/22654c1f71df2ecf3ada009499ed8961/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/22654c1f71df2ecf3ada009499ed8961/status.svg)](https://joss.theoj.org/papers/22654c1f71df2ecf3ada009499ed8961)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@McWhity & @margauxmouchene, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @mikemahoney218 know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest

Checklists

📝 Checklist for @margauxmouchene

📝 Checklist for @McWhity

editorialbot commented 2 months ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf
editorialbot commented 2 months ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1109/TGRS.2017.2711037 is OK
- 10.1109/TGRS.2018.2826045 is OK
- 10.1109/TGRS.2020.3003421 is OK
- 10.1364/JOSAA.18.000338 is OK
- 10.1109/TGRS.2011.2164805 is OK
- 10.1002/2015GL065031 is OK
- 10.1109/TGRS.2011.2124465 is OK
- 10.1109/TGRS.2014.2352853 is OK
- 10.1109/TGRS.2008.2001756 is OK
- 10.3390/rs14020390 is OK
- 10.1016/j.cageo.2022.105291 is OK
- 10.1109/IGARSS.2018.8517504 is OK
- 10.1109/LGRS.2010.2083631 is OK
- 10.6028/jres.066D.020 is OK
- 10.1109/TGRS.2022.3210868 is OK
- 10.1016/j.cageo.2019.104331 is OK
- 10.1109/LGRS.2022.3197423 is OK

MISSING DOIs

- No DOI given, and none found for title: Introducing the OPERA Project for Systematic Surfa...
- No DOI given, and none found for title: JAX: Composable Transformations of Python+NumPy Pr...
- No DOI given, and none found for title: FRInGE; Full-Resolution InSAR Timeseries Using Gen...

INVALID DOIs

- None
editorialbot commented 2 months ago

Software report:

github.com/AlDanial/cloc v 1.90  T=0.17 s (917.0 files/s, 175290.6 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                         115           3953           5333          14623
CSV                              1              0              0           1201
Markdown                        11            290              0            667
TeX                              2             38              0            498
YAML                            11             39             35            433
Bourne Shell                     3             26             26            148
Jupyter Notebook                 2              0           1332            132
Ruby                             1             28             12            106
TOML                             1             19             15             93
JSON                             3              0              0             90
Dockerfile                       1             19             18             47
JavaScript                       1              3              0             15
make                             1              2              0              5
-------------------------------------------------------------------------------
SUM:                           153           4417           6771          18058
-------------------------------------------------------------------------------

Commit count by author:

   511  Scott Staniewicz
    44  pre-commit-ci[bot]
    11  Sara Mirzaee
     4  Simran S Sangha
     4  scott
     3  Geoffrey Gunter
     2  Ryan Burns
     2  dependabot[bot]
     1  Geoffrey M Gunter
     1  seyeonjeon
editorialbot commented 2 months ago

Paper file info:

📄 Wordcount for paper.md is 942

✅ The paper includes a Statement of need section

editorialbot commented 2 months ago

License info:

🟡 License found: Other (Check here for OSI approval)

editorialbot commented 2 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

mikemahoney218 commented 2 months ago

👋🏼 @scottstanie @McWhity @margauxmouchene this is the review thread for the paper. Just about all of our communications will happen here from now on 😄

As a reviewer, the first step is to create a checklist for your review by entering

@editorialbot generate my checklist

as the top of a new comment in this thread.

These checklists contain the JOSS requirements. As you go over the submission, please check any items that you feel have been satisfied. The first comment in this thread also contains links to the JOSS reviewer guidelines.

The JOSS review is different from most other journals. Our goal is to work with the authors to help them meet our criteria instead of merely passing judgment on the submission. As such, the reviewers are encouraged to submit issues and pull requests on the software repository. When doing so, please mention openjournals/joss-reviews#6997 so that a link is created to this thread (and I can keep an eye on what is happening). Please also feel free to comment and ask questions on this thread. In my experience, it is better to post comments/questions/suggestions as you come across them instead of waiting until you've reviewed the entire package.

We aim for reviews to be completed within about 4 weeks. Please let me know if you require some more time.

Please feel free to ping me (@mikemahoney218) if you have any questions/concerns!

margauxmouchene commented 1 month ago

Review checklist for @margauxmouchene

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

margauxmouchene commented 1 month ago

Hello @scottstanie, I am just starting the review and testing of the software, but I am excited to test it further! So far, install following the instructions went well and all the tests passed.

A first little list of comments/questions:

Let me know if I should move some of these to the Issues in dolphin's repo to help you track changes.

Thank you!

scottstanie commented 1 month ago

Thanks for the quick response @margauxmouchene ! Sorry for the slow response, but I should be able to get to the rest of the comments in a few days.

For the first couple points,

can you please clarify the paper's authors contributions to the code:

I wanted to check- did you want a clarification added to the paper? or just here for the review? The short answer is that Heresh Fattahi made the original fringe and has been the project lead for dolphin, and Talib Oliver-Cabrera made/tested the first version of the interpolation module, but somehow isn't the commit author on github (might have been me who moved it due to odd logistics). The others hadn't been on the algorithm team making dolphin code when I wrote the draft in Jan/Feb., but I can revisit the author list with Heresh.

the tests require snaphu-py, isce3 and tophu to run but they are not listed in tests\requirements.txt

That's a good idea to add snaphu-py to the test requirements, but tophu/isce3 are unforunately difficult to install outside of conda, and tophu can't be pip installed without already having isce3 in your environment. It's why the github action is set up with an extra spec at the moment.

but the link is broken

Thank you! That's now fixed on the live docs

I'll try to respond to the other couple comments ASAP, and a newer test dataset will be added to the walkthrough that should be easily downloadable/runnable on Colab

McWhity commented 1 month ago

Hello @scottstanie First of all, let me apologize for not interacting sooner. I will start the review process right away. The next two days are primarily reserved for working on the review. I hope to get some things done. I am looking forward to working on this.

McWhity commented 1 month ago

Review checklist for @McWhity

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

scottstanie commented 1 month ago

Thanks for the patience @margauxmouchene !

dolphin was developed to meet the need for a tool capable of handling the heavy computational demands of large-scale InSAR processing: how does dolphin address this need? what technology/characteristics of this code make it computationally performant for large datasets (such as OPERA's products)?

I've added some clarifying points to the Overview to try and answer this:

what is dolphin's place within the isce-framework project? how does it relate to isce2?

I think it may be a little out of scope to fully explain isce2/3? But here's a rundown, and you can let me know if you think it's worth adding:

isce2 and isce3 are broad, framework-like libraries which contain low-level components for building InSAR processing applications. For instance, both libraries have C++ (or Fortran for isce2) modules to focus raw data into SAR images, or signal processing algorithms to resample one image onto another for co-registration. isce3 is full re-write of isce2 that has been funded by NISAR algorithm development, while isce2 is now run by the InSAR community.

Dolphin starts with a stack of focused, co-registered SAR images; for example, you can use the outputs of the "Coregistered SLC stack workflow" of isce2 and run dolphin on these images. However, since there are now Sentinel-1 geocoded SLC images (which are already co-registered) available on ASF, you can get a displacement output in a fraction of the time by downloading these and running dolphin (see the walkthrough notebook).

Both libraries have the pieces to run a multi-temporal analysis, but the workflows which put them together for isce2 are in the user-contributed "contrib" folder. The highest level that isce2 is set up to process is interferogram formation/unwrapping; most users then pass the results onto to MintPy to obtain one displacement time series. The time series algorithms that Dolphin runs are newer and more advanced than what exists in isce2/3, though it also has the pieces to do simple multi-looked interferogram formation like isce2.

can you please clarify the paper's authors contributions to the code:

A bit more on the author contributions:

The others who are on the committer list but not here as authors were either added much later to the Algorithm Team or are on separate teams building future validation for dolphin.

Also, after a little more internal discussion, Emre Havazli was added to the author list as one of the main testers/debuggers of dolphin for many case studies over the past year. It was an oversight to not have him on in the first place. He is not one of the committers (similar to Talib), but has been very helpful reporting all errors to me and finding improvements for me to implement.

is the dataset used to produce Figure 1 in the paper available? otherwise, is there another sample dataset available for testing?

I have updated the walkthrough notebook in this PR; it should even be testable on Colab now


No problem at all on the later start @McWhity , I was slow to answer @margauxmouchene 's points anyway. Sorry to push changes right after you started to address her points, but it should now be simpler to test the basic walkthrough.

margauxmouchene commented 1 month ago

@editorialbot generate pdf

editorialbot commented 1 month ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

margauxmouchene commented 1 month ago

I've added some clarifying points to the Overview to try and answer this

Thank you for the clarification. I think this part is clear enough now.

I think it may be a little out of scope to fully explain isce2/3?

Yes, exactly, that was more for my own understanding and your explanation was super clear. Thank you!

A bit more on the author contributions

Looks good to me!

I have updated the walkthrough notebook

I am testing it right now, this is looking good and very user-friendly!

Some errors I ran into: In the section Input dataset

with open(url_file, "w") as f:
    f.write("\n".join(best_option.inputs))

raises AttributeError: 'BurstSubsetOption' object has no attribute 'inputs'

I could not find the urls in best_option so I searched the same polygon on the ASF website and put relevant urls in the urls.txt file to be able to download the dataset and run the rest of the code without too much change.

I have to stop here and I will be away from my computer for the rest of the week but I'll resume testing/reviewing early next week.

Have a good week!

margauxmouchene commented 1 month ago

Also, on the paper:

  1. Line 21: period missing at the end of the sentence
  2. Fattahi et al. ref: missing the mention "AGU Fall Meeting Abstracts", maybe? (suggested citation here)
  3. Rosen et al. ref: missing "IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium", maybe?
scottstanie commented 1 month ago

I am testing it right now, this is looking good and very user-friendly!

Some errors I ran into: In the section Input dataset

Shoot I'm sorry I couldn't get these changes in- I saw that password typo after I pushed up the first version, and I made a release of the opera-utils library that does that data search yesterday. Maybe I should have just hard-coded the URLs for this demo.

Also, on the paper:

1. Line 21: period missing at the end of the sentence

2. Fattahi et al. ref: missing the mention "AGU Fall Meeting Abstracts", maybe? (suggested citation [here](https://ui.adsabs.harvard.edu/abs/2019AGUFM.G11B0514F/exportcitation))

3. Rosen et al. ref: missing "IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium", maybe?

I'll double check these citations and make the fixes.

Have a good week!

you as well!

McWhity commented 1 month ago

Hi @scottstanie I run into similar problem as @margauxmouchene.

Here are some first remarks/questions

  1. @scottstanie please let me know as soon as you fixed the AttributeError for downloading the test dataset

    raises AttributeError: 'BurstSubsetOption' object has no attribute 'inputs'

  2. I assume that the branch "main" is the one that we should test. Am I correct?
  3. Regarding the readme
    • Please add the full name for InSAR and PS/DS as especially PS/DS are not known outside of the community (similar to how it is mentioned in the paper manuscript)
  4. Regarding the license
    • For clarification I would recommend to indicate that the first section is the BSC-3-Clause. You could add the name before “Copyright (c) 2022 California …” similar to the section for the Apache License.
  5. Regarding the authors contributions Scott J. Staniewicz lead author (511+4) Sara Mirzaee (11) Geoffrey M. Gunter (4) Tablib Oliver-Cabrera (not committing author) Heresh Fattahi (lead for dolphin project) Not mentioned which is ok based on the low commit count (Ryan Burns (2), Simran Sangha (4), seyeonjeon (1))
  6. I would recommend to include the changelog in documentation
  7. Regarding the checklist point "Substantial Scholarly Effect"
    • If applicable please add within the paper manuscript if the software were used to process data or if results are already published in other papers. Further you could indicate/cite projects were dolphin is currently used.
  8. Broken links

More comments will follow

scottstanie commented 1 month ago

Thank you @McWhity !

Hi @scottstanie I run into similar problem as @margauxmouchene.

Here are some first remarks/questions

1. @scottstanie please let me know as soon as you fixed the AttributeError for downloading the test dataset

raises AttributeError: 'BurstSubsetOption' object has no attribute 'inputs'

Sorry that it didn't work for you. Just checking, how did you run it? I just checked the "testable on Colab now" where it runs a pip install dolphin pip install opera-utils, and I didn't get that error. the dolphin.show_versions() call should show 0.12.0 for opera_utils.

Maybe the fix was poorly timed and you had already installed the software earlier?

2. I assume that the branch "main" is the one that we should test. Am I correct?

Yes main is good. joss has the paper, but I have been merging the changes from main into joss, so both are fine to test.

3. Regarding the [readme](https://github.com/isce-framework/dolphin/blob/main/README.md)

* Please add the full name for InSAR and PS/DS as especially PS/DS are not known outside of the community (similar to how it is mentioned in the paper manuscript)

Thanks, added here

4. Regarding the [license](https://github.com/isce-framework/dolphin/blob/main/LICENSE)

* For clarification I would recommend to indicate that the first section is the BSC-3-Clause. You could add the name before “Copyright (c) 2022 California …” similar to the section for the Apache License.

Added BSD clarification here, looks like the Apache section already starts with "Apache License"

5. Regarding the authors contributions
   Scott J. Staniewicz lead author (511+4)
   Sara Mirzaee (11)
   Geoffrey M. Gunter (4)
   Tablib Oliver-Cabrera (not committing author)
   Heresh Fattahi (lead for dolphin project)
   Not mentioned which is ok based on the low commit count (Ryan Burns (2), Simran Sangha (4), seyeonjeon (1))

* I would suggest to include a credit of author statement in the paper manuscript. Similar to the ones, journals like elsevier (https://www.elsevier.com/researcher/author/policies-and-guidelines/credit-author-statement) or mdpi have. But of course adjusted to the special needs of software development. As I do not know the opinion/options in place by JOSS it might be good if @mikemahoney218 could comment on that to.

I'd be interested in any JOSS suggestion about how to add that.

6. I would recommend to include the changelog in documentation

Added here

7. Regarding the checklist point "Substantial Scholarly Effect"

* If applicable please add within the paper manuscript if the software were used to process data or if results are already published in other papers. Further you could indicate/cite projects were dolphin is currently used.

The main research work using Dolphin is still ongoing as part of the OPERA project. I'll double check if there are any citations that make sense to add for that which currently exist.

8. Broken links

* https://dolphin-insar.readthedocs.io/en/latest/getting-started/ penultimate line "notebooks"

* https://dolphin-insar.readthedocs.io/en/latest/notebooks/walkthrough-basic/ under Setup the link to "Getting Started"

Thank you! Fixed

mikemahoney218 commented 2 weeks ago

Hi @margauxmouchene and @McWhity ! I'll admit I've lost track of this review a little bit -- would you be able to let me know how far you are into your review?

scottstanie commented 2 weeks ago

Also, on the paper:

1. Line 21: period missing at the end of the sentence

2. Fattahi et al. ref: missing the mention "AGU Fall Meeting Abstracts", maybe? (suggested citation [here](https://ui.adsabs.harvard.edu/abs/2019AGUFM.G11B0514F/exportcitation))

3. Rosen et al. ref: missing "IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium", maybe?

Sorry that these slipped, I've push up the fixes to the JOSS branch.

margauxmouchene commented 2 weeks ago

Hello!

@scottstanie with the fix you made, everything runs smoothly for me (I reinstalled everything).

In the 'walkthrough-basics' notebook, in the 'Visualization the displacement' (typo maybe?) section, in the last executable code block:

But everything else works, it's easy to follow and it's a great starting point to learn how to use dolphin.

In the other notebook ('simulate-demo'):

@mikemahoney218 I am almost done with the review, I just want to test the GPU speedup, which is one of the functionality claims of the paper. I should be able to wrap this all up before the end of the week.

scottstanie commented 2 weeks ago

Thanks @margauxmouchene !

In the 'walkthrough-basics' notebook, in the 'Visualization the displacement' (typo maybe?) section, in the last executable code block:

* there is a missing `import numpy as np` somewhere (`np.unravel_index` is called)

* `io.load_gdal` should be `dolphin.io.load_gdal`

* `pixels` is not defined

Thanks for that, I had someone else point that out just after you started your review, fixed in this commit

But everything else works, it's easy to follow and it's a great starting point to learn how to use dolphin.

In the other notebook ('simulate-demo'):

* `samps3d = simulate.create_noisy_deformation(C, defo_stack)`
  returns the error:
  `AttributeError: module 'dolphin.phase_link.simulate' has no attribute 'create_noisy_deformation' `

Ah thanks for catching that! I fixed my function renaming here.

McWhity commented 1 week ago

@mikemahoney218: I hope to finish the review beginning of next week

mikemahoney218 commented 1 week ago

Thank you, @McWhity and @margauxmouchene !

margauxmouchene commented 1 week ago

Hello @mikemahoney218, @scottstanie, @McWhity, I think I am done reviewing, all my comments were addressed. I was not really able to test the GPU acceleration but this is due to config issues on my side. Hopefully, @McWhity can test this properly? Everything else is good! Thank you for the opportunity, this project is awesome!

mikemahoney218 commented 4 days ago

Thank you so, so much @margauxmouchene !

McWhity commented 4 days ago

Hello @scottstanie, @margauxmouchene, @mikemahoney218, I think I am done with my review. So far it looks good. However there are a view things that need to be addressed. Main issues are

I tried to group my comments based on

Paper

Readme

Documentation

Section Getting started

Dolphin basic walkthrough

Contributing

Missing section/information

What I am looking for is a graphical explanation of the capabilities that dolphin provides. Thus, users that are working in SAR but not necessarily in In-SAR can understand how dolphin can help them in archiving there desired goal.

Testing GPU setup

Documentation

I could not get the GPU up and running (Problems)

I had to search but I found a computer with NVIDIA card with CUDA support. However, I did run in some problems.

Attempt 1: NVIDIA driver with cuda 11.6

pip install --upgrade "jax[cuda11_pip]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html Warning: jax 0.4.31 does not provide the extra 'cuda11.pip'

When running the dolphin_config.yaml it get

Warning: An NVIDIA GPU may be present on this machine, but a CUDA-enabled jaxlib is not installed. Falling back to cpu.

Installed

jax 0.4.31
jaxlib 0.4.31

stated dependencies within dolphin are jax>=0.4.19

Attempt 2+3: NVIDIA driver with cuda 12.4 / 12.6

before installing jax I get the same Warning as during with cuda 11.6 Warning: An NVIDIA GPU may be present on this machine, but a CUDA-enabled jaxlib is not installed. Falling back to cpu.

After installing pip install --upgrade "jax[cuda12_pip]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

I get the follwing Error

E external/xla/xla/stream_executor/cuda/cuda_driver.cc:216] failed call to cuInit: INTERNAL: CUDA error: Failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected

By the way mamba install cudatoolkit=12.6 does not work as only cudatoolkit up to 11.8 can be found. Following Error occurs cudotoolkit 12.6** does not exist (perhaps a typo or a missing channel).

Following code snippit does work mamba install nvidia/label/cuda-12.6.0::cuda-toolkit

However I get the Error

E external/xla/xla/stream_executor/cuda/cuda_driver.cc:216] failed call to cuInit: INTERNAL: CUDA error: Failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected

nvcc --version provides
nvcc:: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Fri_Jun_14_16:36:21_PDT_2024
Cuda compilation tools, release 12.6 V12.6.20
Buzild cuda_12.6.r12.65/compiler.34431801_0

By the way that happens although within dolphin_config.yaml gpu_enabled: false It I change it to true the error is not shown but the script is somehow stopping

Running tests

I usually run the tests by using python -m unittest discover -s tests. But this somehow did not work for me. One problem was that I had to install further packages like pytest, boto3, moto etc. The requirements are listed in the tests folder, however as the tests are part of the software please make your that all dependencies are installed during the initial installation process. After I installed some dependencies I did run python -m unittest discover -s tests again but the output was

Ran 0 tests in 0.000s
NO TESTS RAN

This is were I stopped further investigating. Thus please make your that the tests can be run properly. Just out of curiosity: How did you run the test so far?

Final comment: This is a really good and useful python packages.

margauxmouchene commented 3 days ago

What I am looking for is a graphical explanation of the capabilities that dolphin provides

+1, great idea!

Also, I had similar issues installing cuda and ended up installing cuda-toolkit 12.2 (tried different options), then the script runs without failing but never finishes.

scottstanie commented 3 days ago

Thank you for all the comments! i'll try to get more detailed responses to more of it shortly, but about your last point on the tests and unit test setup-

https://dolphin-insar.readthedocs.io/en/latest/developer-setup/

I've linked the CONTRIBUTING info here:

I think the fact that this was unclear to you is a point in favor of what you said- "the info is scattered between JOSS, README, and the docs now". I'll make another pass and try to make sure that the development setup is clear in both docs and the readme.

As for the GPU setup... :\ I'm currently annoyed at how much of a moving target it feels like setting up any GPU libraries are. The first attempt I had on our linux server, setting up the conda environment Just Worked. But then the next time, I also had to reinstall JAX via pip to make sure it picked the right one for the CUDA libraries we had.

I'll take another look at the errors you posted and see if I can think of a clear way to augment the JAX/Numba GPU setup instructions.