ropensci / software-review

rOpenSci Software Peer Review.

Submission: colocr #243

Closed MahShaaban closed 5 years ago

MahShaaban commented 6 years ago

Summary

Package: colocr
Type: Package
Title: Conduct Co-localization Analysis of Fluorescence Microscopy Images
Version: 0.1.0
License: GPL-3
Authors@R: person("Mahmoud", "Ahmed",
    email = "mahmoud.s.fahmy@students.kasralainy.edu.eg",
    role = c("aut", "cre"))
URL: https://github.com/MahShaaban/colocr
BugReports: https://github.com/MahShaaban/colocr/issues
Description: Automate the co-localization analysis of fluorescence microscopy
  images. Select regions of interest, extract pixel intensities from
  the image channels, and calculate different co-localization statistics.
Encoding: UTF-8
LazyData: true
Suggests: testthat,
    covr,
    knitr,
    rmarkdown,
    devtools
RoxygenNote: 6.0.1
Imports: imager,
  shiny,
  scales
VignetteBuilder: knitr

[e.g., "data extraction, because the package parses a scientific data file format"] data extraction

Requirements

Confirm each of the following by checking the box. This package:

Publication options

Detail

maelle commented 6 years ago

I got a difference again... What I got was "cor": "Average PCC: 0.76 and Average MOC: 0.9" πŸ€”

I'll wait until you've fixed the AppVeyor build before trying again.

MahShaaban commented 5 years ago

Hey @maelle, apologies for the late reply. I've been away from my workstation since the beginning of the week. Now that I've installed phantomjs on AppVeyor, the app tests can run. The tests actually failed on AppVeyor in a way similar to the one you mentioned in your last comment. I can't replicate the error locally though! Could you please give me some information about your setup environment?

maelle commented 5 years ago

:wave: @MahShaaban!

My session info was in https://github.com/ropensci/onboarding/issues/243#issuecomment-421452708

Do you have access to a Windows machine?

MahShaaban commented 5 years ago

Thanks @maelle. I checked the package versions; they are comparable to the ones I have, and I still can't reproduce the error. Assuming the tests fail only on Windows, what would be the way forward? I don't have easy access to a Windows machine, and I have never really used R on one.

maelle commented 5 years ago

I'll have a look myself in the next few days. We'll solve this sooner or later 😁

MahShaaban commented 5 years ago

Thanks a lot @maelle

maelle commented 5 years ago

I've just had a look, can't do much more at the moment.

maelle commented 5 years ago

I'm totally inexperienced with your app, hence my poor debugging. I was thinking that maybe when you set one of the inputs using the shinytest commands, the sliders move more or less than they do on Mac/Linux?

Btw, on Travis do you only test on Linux? It'd be worth adding a Mac build, just to see whether you only get the issue on Windows.

maelle commented 5 years ago

Or maybe it's due to a waiting time that's not long enough on Windows? Could you try adding a waiting time before the snapshot? Or even adding time between all commands?

End of my suggestions for today, sorry about that.

maelle commented 5 years ago

Maybe useful: https://rstudio.github.io/shinytest/articles/in-depth.html#getting-input-output-and-export-values (to check that the inputs have been set).
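
For instance, something along these lines could make both the waiting and the input values explicit (just a sketch; the input name and value are placeholders, and the app path assumes the app lives at inst/colocr as in the package):

library(shinytest)

app <- ShinyDriver$new("inst/colocr")

# give the inputs time to take effect before snapshotting
app$setInputs(threshold = 90, wait_ = TRUE, timeout_ = 10000)
Sys.sleep(2)

# check which input/output values the app actually holds at this point
str(app$getAllValues())

app$snapshot()
app$stop()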

MahShaaban commented 5 years ago

Hey @maelle, here is what I tried recently.

Here is an idea. Since you can reproduce the errors on your local Windows machine, I think it would be useful to see whether the tests pass when you run them for the first time. To do that, you need to remove the expected-output folders from the app tests folder, inst/colocr/tests/*-expected/. Once those two folders are removed, you can run the tests from the app directory with shinytest::testApp(). This will give a message saying the tests are running for the first time and the expected images are being created. We can then compare the logs.
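
In code, something like this from the app directory (a sketch, assuming the working directory is inst/colocr):

# delete the stored snapshots so shinytest treats this as a first run
unlink(list.files("tests", pattern = "-expected$", full.names = TRUE),
       recursive = TRUE)

# re-run the app tests; a first run records new expected output instead of comparing
shinytest::testApp(".")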

maelle commented 5 years ago

Doing it now... after removing the timeout_ arguments πŸ‘Ό

maelle commented 5 years ago

This is still quite slow and I get no message yet πŸ€”

MahShaaban commented 5 years ago

I wrote two tests to check whether the output of the app is reproduced using the same input parameters. One test used the colocr functions and the other did not. The tests passed locally and on Travis, while both of them failed on AppVeyor. Given that these are the only tests that actually check the numeric outputs (the co-localization stats), I think that neither the app nor the colocr functions are the root of the problem.

The tests are in a file called tests/test-reproduce_app.R in this last commit

PS: @seaaan noticed before that the part of the vignette where I check the reproduction of the app output from the same input returns FALSE. The code chunk is check_equal at line 313 of the vignette. Was this happening on a Windows machine?

maelle commented 5 years ago

Interesting. Can you add a more minimal example using imager and data that's not in colocr, so that I can run it and we can post it in the imager repo?

MahShaaban commented 5 years ago

I am not sure how to make a minimal example in this case. So far, I've been checking the final outputs of either colocr or imager, and the tests pass locally and on Travis but not on AppVeyor. I am using multiple imager functions, and this difference could be due to any of them.

So, my current thought is to build imager on Travis and AppVeyor and run this test. Meanwhile, I am trying to figure out a way to identify the function or functions causing the issue. I am not sure this is the smartest way to do it, but I think I can save all intermediary objects from the test run in an R object and compare them to a test run on Windows/AppVeyor.
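
Something like this could work (a rough sketch; the grayscale step and the file names are placeholders for the actual intermediate steps of the test):

# compute the intermediate objects of interest (placeholders for the real pipeline steps)
fl <- system.file("img", "Rlogo.jpg", package = "jpeg")
steps <- list(
  loaded = imager::load.image(fl),
  gray   = imager::grayscale(imager::load.image(fl))
)

# on Linux, save them; on Windows/AppVeyor, load the saved copy and compare step by step
if (!file.exists("steps_linux.rds")) {
  saveRDS(steps, "steps_linux.rds")
} else {
  ref <- readRDS("steps_linux.rds")
  Map(function(a, b) all.equal(as.numeric(a), as.numeric(b)), ref, steps)
}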

maelle commented 5 years ago

@MahShaaban I don't see the test script with download.file()? Could you please share a gist without testthat? I'll then run it. I was thinking that seeing imager:: explicitly would help, and at each point where you use an imager function, if possible wrap the call in () to show the output; this way it'll be easier to compare.
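
For instance (a small sketch just to illustrate the pattern, on the sample image from the jpeg package rather than the actual test data):

fl <- system.file("img", "Rlogo.jpg", package = "jpeg")
# wrapping each assignment in () prints the intermediate result,
# so the logs from the two platforms can be compared line by line
(img  <- imager::load.image(fl))
(gray <- imager::grayscale(img))
(m    <- mean(img))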

MahShaaban commented 5 years ago

Sorry, I forgot to link to the test script in the imager fork. Here is a gist of the script. I updated it in the second revision to remove the testthat calls.

MahShaaban commented 5 years ago

I traced the difference between the calculated correlations on Ubuntu and Windows to the very first step, loading the images. I think this (dahtah/imager#41) is related, although neither the maintainer nor the user has followed up on the issue yet!

I found that the pixel values of the images in the colocr package are loaded differently on the two platforms. This is at least in part due to jpeg::readJPEG(), which is used in imager::load.image() and colocr::image_load(). I noticed the same when I tried other images. Here is one from the jpeg package itself.

On Ubuntu

> version
               _                           
platform       x86_64-pc-linux-gnu         
arch           x86_64                      
os             linux-gnu                   
system         x86_64, linux-gnu           
status                                     
major          3                           
minor          5.1                         
year           2018                        
month          07                          
day            02                          
svn rev        74947                       
language       R                           
version.string R version 3.5.1 (2018-07-02)
nickname       Feather Spray
> packageVersion('jpeg')
[1] β€˜0.1.8’               
> fl <- system.file('img', 'Rlogo.jpg', package = 'jpeg')
> img <- jpeg::readJPEG(fl)
> mean(img)
[1] 0.7046421

On Windows

> version
               _                           
platform       x86_64-w64-mingw32          
arch           x86_64                      
os             mingw32                     
system         x86_64, mingw32             
status                                     
major          3                           
minor          5.1                         
year           2018                        
month          07                          
day            02                          
svn rev        74947                       
language       R                           
version.string R version 3.5.1 (2018-07-02)
nickname       Feather Spray
> packageVersion('jpeg')
[1] β€˜0.1.8’               
> fl <- system.file('img', 'Rlogo.jpg', package = 'jpeg')
> img <- jpeg::readJPEG(fl)
> mean(img)
[1] 0.7047047

Notice the difference starting at the 4th decimal place. I am showing the mean here, but I visually inspected the values themselves and they also look different. Could that be due to instability/inaccuracies in the very small decimal places?
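
For example, using the two means printed above:

ubuntu_mean  <- 0.7046421
windows_mean <- 0.7047047
identical(ubuntu_mean, windows_mean)                     # FALSE
all.equal(ubuntu_mean, windows_mean, tolerance = 1e-3)   # TRUE, within a loose tolerance
all.equal(ubuntu_mean, windows_mean, tolerance = 1e-5)   # reports a relative difference of ~9e-5

If the platform difference really does come from the JPEG decoding, comparing the numeric outputs with a tolerance like this would absorb it.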

I couldn't go beyond that, as jpeg::readJPEG itself is a call into compiled code (the jpeg package's bindings to the system JPEG library), as far as I can tell.

> jpeg::readJPEG
function (source, native = FALSE) 
.Call("read_jpeg", if (is.raw(source)) source else path.expand(source), 
    native, PACKAGE = "jpeg")
<bytecode: 0x2d11910>
<environment: namespace:jpeg>
MahShaaban commented 5 years ago

I used image_read from the magick package instead of load.image from imager, and this seems to solve the issue of the Ubuntu/Windows differences. See MahShaaban/colocr#3.
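
Roughly, the swap looks like this (a sketch on the jpeg sample image; converting the raw bitmap to doubles in [0, 1] is one way to get values comparable to what load.image returns):

fl  <- system.file("img", "Rlogo.jpg", package = "jpeg")
img <- magick::image_read(fl)
# image_data() returns a raw bitmap; convert to doubles in [0, 1] for the stats
px  <- as.integer(magick::image_data(img, channels = "rgb")) / 255
mean(px)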

maelle commented 5 years ago

Cool! In that case why not switch the whole package to magick? πŸ˜‰

jeroen commented 5 years ago

I would second maelle's suggestion to try and switch to magick. It is a much more comprehensive and reliable image toolkit. Is there particular functionality in imager that you are missing from magick?

MahShaaban commented 5 years ago

Thanks, @maelle @jeroen for the suggestion. I certainly don't mind looking into that.

Although the package currently relies heavily on imager, I don't mind switching to magick. I went through the magick vignette, and I think the classes and the basic image transformations that I'd need are already there. However, I don't see equivalents/alternatives to the morphological operations in imager, namely shrink(), grow(), fill() and clean(). Or am I missing something in magick that could replace these?

Here is the relevant part from the NAMESPACE

importFrom(imager,clean)
importFrom(imager,fill)
importFrom(imager,grow)
importFrom(imager,shrink)
importFrom(imager,threshold)
jeroen commented 5 years ago

Thanks, see also this issue: https://github.com/ropensci/magick/issues/136

In the latest dev version of magick you can find the morphology methods with morphology_types():

> morphology_types()
 [1] "Undefined"         "Correlate"         "Convolve"          "Dilate"           
 [5] "Erode"             "Close"             "Open"              "DilateIntensity"  
 [9] "ErodeIntensity"    "CloseIntensity"    "OpenIntensity"     "DilateI"          
[13] "ErodeI"            "CloseI"            "OpenI"             "Smooth"           
[17] "EdgeOut"           "EdgeIn"            "Edge"              "TopHat"           
[21] "BottomHat"         "Hmt"               "HitNMiss"          "HitAndMiss"       
[25] "Thinning"          "Thicken"           "Distance"          "IterativeDistance"
[29] "Voronoi"       

I think the main features you use are:

I'm not sure what exactly imager::clean does under the hood, but the ImageMagick morphology manual explains several morphology methods that can be used for cleaning. We also have a function magick::image_despeckle().

For thresholding you can try image_threshold() or you can try some of the morphology methods.
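
For example, a rough sketch of how those imager operations might map onto magick (the method and kernel names here are guesses to be checked against the morphology manual, and the argument names may differ in the dev version):

img <- magick::image_read(system.file("img", "Rlogo.jpg", package = "jpeg"))
# threshold first, then apply morphology operations roughly analogous to
# imager::shrink()/grow()/clean()
bw     <- magick::image_threshold(img, type = "black", threshold = "50%")
eroded <- magick::image_morphology(bw, method = "Erode",  kernel = "Disk")  # ~ shrink()
grown  <- magick::image_morphology(bw, method = "Dilate", kernel = "Disk")  # ~ grow()
opened <- magick::image_morphology(bw, method = "Open",   kernel = "Disk")  # erode then dilate, ~ clean()
smooth <- magick::image_despeckle(bw)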

MahShaaban commented 5 years ago

Sounds great. I will be looking into that. Thanks @jeroen

maelle commented 5 years ago

Approved! Thanks @MahShaaban for submitting and @seaaan @haozhu233 for your reviews! 😸

To-dos:

Should you want to acknowledge your reviewers in your package DESCRIPTION, you can do so by making them "rev"-type contributors in the Authors@R field (with their consent). More info on this here.

Welcome aboard! We'd also love a blog post about your package, either a short-form intro to it (https://ropensci.org/tech-notes/) or a long-form post with more narrative about its development (https://ropensci.org/blog/). If you are interested, @stefaniebutland will be in touch about content and timing.

We've started putting together a gitbook with our best practices and tips; this chapter starts the 3rd section, which is about guidance for after onboarding. Please tell us what could be improved; the corresponding repo is here.

MahShaaban commented 5 years ago

Thank you, everyone. I transferred the repo, fixed the CI links, and will look into the other suggestions.

I'd like to acknowledge your contributions @maelle, @haozhu233, @seaaan, and @jeroen if you don't mind.

maelle commented 5 years ago

Awesome!

Don't acknowledge my contributions, as mentioned here "Please do not list editors as contributors. Your participation in and contribution to rOpenSci is thanks enough!" :wink: (we mean it!)

stefaniebutland commented 5 years ago

Hello @MahShaaban. Are you interested in writing a post about your package for the rOpenSci blog, either a short-form intro to it (https://ropensci.org/tech-notes/) or long-form post with more narrative about its development (https://ropensci.org/blog/)?

This link will give you many examples of blog posts by authors of onboarded packages so you can get an idea of the style and length you prefer: https://ropensci.org/tags/onboarding/.

Here are some technical and editorial guidelines for contributing a post: https://github.com/ropensci/roweb2#contributing-a-blog-post.

Please let me know what you think.

MahShaaban commented 5 years ago

Thanks, @stefaniebutland for this opportunity. I'd like to write a blog post about colocr. I will read the guides first and get back to you to discuss it.

stefaniebutland commented 5 years ago

@MahShaaban What do you think about setting a deadline to submit a draft post? I'm happy to answer any questions you might have.

MahShaaban commented 5 years ago

Hey @stefaniebutland. I certainly don't mind that. I read the guides you referred to earlier, and I think I will go with a short intro post. The idea is to adapt the parts of the vignette that explain the goal of the package and how it works, with examples. If this is okay, I will start right away.

stefaniebutland commented 5 years ago

If you're referring to a tech note (https://ropensci.org/technotes/), they don't require scheduling on a certain day of the week so please submit your draft when ready and I'll review it soon after.

adapt the parts of the vignette that explain the goal of the package and how it works, with examples.

Sounds good. Make sure it's different enough from the vignette. It's good if you can lay out one cool example of what you can do with the package, rather than giving several examples.

MahShaaban commented 5 years ago

Okay. Thanks @stefaniebutland.

stefaniebutland commented 5 years ago

@MahShaaban, @maelle just reminded me that this is your second package onboarding! I'm quite curious about package authors' motivations for submitting multiple packages, e.g. are there diminishing returns on the author's effort in subsequent submissions?

I know you indicated you prefer to write a tech note about colocr, but if it interests you and you see value in it for yourself, I'd love to read a blog post that features colocr as you described, but also reflects on your experiences and motivation for onboarding multiple packages.

Zero obligation to do more than you suggested! πŸ˜„

MahShaaban commented 5 years ago

The truth is, I intended to write a blog post about this recent submission, and the same happened the first time I submitted a package to rOpenSci. The reason I shied away from it is that I don't see how a detailed description of the package and its features could be different from the vignette! I am definitely willing to be educated on this; there might be different ways of writing, or different aspects of the package that I should focus on, when writing a blog post vs. a package vignette. I think being familiar with the submission and review process helped a lot the second time around, so the second submission was easier in that sense. In both cases, I had a very positive experience, and I think the reviews and suggestions I received improved the packages.

stefaniebutland commented 5 years ago

Sorry @MahShaaban, I think I misunderstood when I thought you wanted to write a tech note. Yes, your idea for a blog post, "to adapt the parts of the vignette that explain the goal of the package and how it works, with examples", sounds good.

I don't see how a detailed description of the package and the features could be different from the vignette!

I think the blog post differs from the vignette in that the post should tell a bit of a story. Unlike a vignette, it's an opportunity to give your personal perspective on the package, like something you learned, or some really strong motivation for creating it. Was it your first Shiny app? Do you have any general tips for packages with Shiny apps? This might make the post interesting for people outside your field. (Thanks to @maelle for suggesting this to me when I asked her for advice.) Do you know of other users of your package? And how do they use it? Any of those things could go in the post.

One of the big benefits of contributing a blog post is that it can get more eyes on your work. Once published, we tweet from rOpenSci to >20,000 followers and it gets picked up by R-weekly live and R-bloggers.

With that, would you like to set a deadline for submitting a draft via pull request? Technical and editorial guidelines: https://github.com/ropensci/roweb2#contributing-a-blog-post.

MahShaaban commented 5 years ago

Hey @stefaniebutland, I just submitted a PR with a first draft of the blog post: ropensci/roweb2#329. Please let me know what you think.