panovr closed this pull request 6 years ago.
Just as a note: The link to issue 465 is broken in your comment. I can c&p the text though.
@hanslovsky fixed
Thanks so much for working on this! It is looking really, really good!
Three main concerns:
If you can address the above three things, I am happy.
I notice that this PR covers hough line transform. Are there plans for hough circle transform?
@ctrueden I really appreciate the detailed code review, thanks!
FIX: `imglib2-algorithm-gpl` is needed for the `PickImagePeaks` class in `HoughTransform.java` at this line.
TODO:
HELP: Should I call `public void setSuppression( final double[] r )` in class `PickImagePeaks` after the vote space image has been computed, if I want to invoke the method `public ArrayList< long[] > getPeakList()` at this line?
@tinevez the `imglib2-algorithm-gpl` is needed for the `PickImagePeaks` class.
OK. If the implementation would be more complicated without using this class, then I propose to add these Hough classes instead to imglib2-algorithm-gpl, to avoid circular dependency, and to have correct, legal licensing.
I will add unit tests. Is there any example I can follow?
There are tons of unit tests across many components. We use JUnit. You could look at e.g. `DilationTest` in this repository to see how Jean-Yves tests the morphological dilate operation.
What exactly is the `PickImagePeaks` class used for in this case? Is it simply to find local extrema? You could use `LocalExtrema` instead.
@panovr Did you try with `LocalExtrema` yet?
@hanslovsky @ctrueden I will investigate `LocalExtrema` later.
I am taking care of the Hough circle transform.
@tinevez OK, thank you very much!
@tinevez are you planning to migrate your work in https://github.com/tinevez/CircleSkinner/tree/master/src/main/java/net/imagej/circleskinner/hough to imglib2-algorithm?
Yes, before the end of June.
@tinevez I have a new undergrad student, @gselzer, working on ImageJ Ops and ImgLib2 stuff now. If there is anything he could take off your plate relating to the Hough transform project, just let us know!
Christ I am late everywhere. @gselzer can we chat over gitter sometime about this?
@tinevez I'm sorry that I did not see this comment before, but I am just now beginning to move onto this project and would love to chat on Gitter!
See also imagej/imagej-ops#526.
@panovr did you have a look at `LocalExtrema` and whether or not it could (partially) replace `PickImagePeaks`?
@hanslovsky FYI, @panovr has moved on from LOCI (back to China, where his main project is machine learning classification and segmentation of bird species). But our student @gselzer could (will! :wink:) work on completing this PR.
Thanks for the update @ctrueden
@hanslovsky @ctrueden Fixes:
- `LocalExtrema` is now used instead of `PickImagePeaks`.
- All commits build and all tests pass on each commit.
@hanslovsky @tpietzsch @tinevez After going through this PR with @gselzer and cleaning things up, we are now reasonably satisfied with it. The dependency on GPL code was purged.
It is questionable whether various calculation methods should really be returning `boolean` to indicate success. What do you think? Any further changes you'd like to see before merging this?
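For context, here is a generic sketch contrasting the boolean-success pattern with the usual alternative of throwing an exception. This is not the PR's actual code; the class and method names are made up for illustration:

```java
// Generic sketch of two failure-reporting styles; the names here are
// hypothetical, not the actual HoughTransforms API.
public class FailureReporting {

    // Style 1: boolean return value. The caller must remember to check it,
    // and a forgotten check silently swallows the failure.
    static boolean computeWithFlag(final int[] data) {
        if (data == null || data.length == 0)
            return false;
        // ... perform the computation ...
        return true;
    }

    // Style 2: throw on invalid input. Failure cannot be accidentally
    // ignored, and the signature stays free of a success flag.
    static void computeOrThrow(final int[] data) {
        if (data == null || data.length == 0)
            throw new IllegalArgumentException("data must be non-empty");
        // ... perform the computation ...
    }
}
```

The exception style is generally preferred in Java APIs when failure indicates a programming error rather than an expected outcome.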
@ctrueden @gselzer I haven't had time to go through the code yet but hopefully will be able to do so this week. I agree that reporting success through a `boolean` return value is probably not necessary, but I will say more once I have had a closer look.
I finally had some time to go over the code. I feel that this PR needs some work before we can merge it. These are my (itemized) thoughts:
- `RandomAccessibleInterval` should be used as parameter over `Img`.
- `RandomAccessibleInterval` allocated by the caller. That way, `OutputAlgorithm.getResult` becomes obsolete for this class and static methods can be used as suggested above.
- Benchmark
- Peaks in the votespace should not be `List< long[] >` but `List< Point >` instead:
  - `List< long[] >` creates unnecessary copies, and
  - the `long[]` array is not cloned before adding, i.e. all maxima are the same.

  This will make `HoughTransform.pickPeaks` trivial/obsolete. As there are several bugs in this method (cf. comments in diff), this seems like a good idea.
- No `ExecutorService` or related classes/methods should be passed. Parallelization can be implemented on top of single-threaded basic functionality (if desired at all).
- `findLocalExtrema( RandomAccessibleInterval, LocalNeighborhoodCheck, ExecutorService )` is the only method that requires an `ExecutorService` and should be replaced by one of the more recent `LocalExtrema.findLocalExtrema` signatures that do not provide parallelization.

Thanks @hanslovsky! @gselzer Please make the changes when you get a chance.
@hanslovsky I agree that the votespace peaks could be `List<Point>` instead of `List<long[]>`. I fail to understand, however, how changing this would make `HoughTransform.pickPeaks()` obsolete, unless you are proposing moving the peak picking to another location. I think that going forward with this would certainly trim the method down, but it should not be obsolete. Can you elaborate on this?
@gselzer With `List<Point>` instead of `List<long[]>`, `HoughTransform.pickPeaks` would be reduced to a single line (or two, considering that we have a generic `T` instead of `IntType`):

```java
LocalExtrema.findLocalExtrema( voteSpace, new LocalExtrema.MaximumCheck<>( new IntType( 1 ) ) );
```
In general, users might want to use their own ways to detect local extrema, e.g. to minimize noise, but you are right that it is probably best to leave the `pickPeaks` method in place for users who do not.
@hanslovsky @ctrueden I have fixed all of the comments above, either in their own commit or in the larger commit where I refactored the algorithm to use static methods instead of stateful classes (I would like to fix the git history at some point as well). Let me know any further comments you might have.
I massaged the old votespace regression test so that it now passes. I am currently writing a peak finder regression test, since such a test was never written, and the results are looking promising. The only problem that I have is that the width of the vote space is over the wrong interval; currently this interval is [0, maxRho * 2), whereas we would like [-maxRho, maxRho) to be the interval. Is there a way to change this interval?
My solution to the interval problem was to create a translated `IntervalView` within `PickPeaks`, which allows us to return the correct polar coordinates for our line without a) automatically mutating the input at any point or b) requiring the user to do so before getting the peaks from the vote space. For now I do not see any reason to make this optional, since without this translation `PickPeaks` will not return the correct polar coordinate results; however, if there is a reason to make it optional, I can do so.
I also added static methods to generate the slope and intercept from the polar coordinate output, as well as tests for all relevant methods.
Let me know if there are any issues.
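For readers following along, the polar-to-slope/intercept conversion mentioned above follows from the standard Hough line parameterization rho = x\*cos(theta) + y\*sin(theta). Below is a minimal standalone sketch of that math; the class and method names are illustrative, not the actual methods added in this PR:

```java
// Converting Hough polar coordinates (rho, theta) back to slope/intercept
// form y = m*x + b, using rho = x*cos(theta) + y*sin(theta).
// Class and method names here are illustrative, not the PR's actual API.
public class HoughLineConversion {

    // m = -cos(theta) / sin(theta); undefined for vertical lines (sin(theta) == 0).
    static double slope(final double theta) {
        return -Math.cos(theta) / Math.sin(theta);
    }

    // b = rho / sin(theta); also undefined for vertical lines.
    static double intercept(final double rho, final double theta) {
        return rho / Math.sin(theta);
    }

    public static void main(final String[] args) {
        // The horizontal line y = 24 has (rho, theta) = (24, pi/2):
        // slope ~ 0 and intercept 24.
        System.out.println(slope(Math.PI / 2));
        System.out.println(intercept(24, Math.PI / 2));
    }
}
```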
@hanslovsky Regarding your comment about avoiding `ImageIO` in favor of a hand-designed ground truth: it would be helpful to have some clarification about what the concerns are. Is it the usage of a `javax` class in the test scope? Are you concerned about the test being incompatible with some JVMs? Or is there some other reason you do not like the result image saved as a resource? The alternative (embedding a large primitive array in the test class) seems no better to me, as it would be larger on disk.
@ctrueden My issues with the current test are two-fold:
- The test depends on `javax` and `awt`. imglib2 is an image processing library and not an image io/ui library, and therefore should avoid dependencies on UI framework libraries. This is my personal opinion and maybe the @imglib/admins should comment on that.

@hanslovsky I do not know if this changes your opinion on the matter, however the size of the test image is 50 x 50, not 501 x 501 (the 50s are being specified as `long`, which might explain your confusion). With that being said, we could refactor this test to a non-square input image if that would solve any issues.
Thanks for the clarification @gselzer. I still believe that the ground truth for the test should be hand-crafted within the code for reproducibility. If we don't know how the test ground truth was generated, we will not be able to know whether the test fails because the algorithm is wrong or because we messed up something in the test.
I suggest:
@hanslovsky I will rewrite the tests such that we test against a smaller, non-square image with 2+ lines. I agree that this will create a better test. However, I still do not understand what you mean by a "hand-crafted" ground truth. My understanding of this (and correct me if I am wrong) is that you want a way, in the code, of determining what the value should be at a particular pixel instead of relying on an image. This is understandable, but because there is really no equation that describes these vote spaces, the only options that I see here are:
- a `byte[]` within the test

I do not really know which of these would be the right option.
@gselzer Yes, you understood me correctly there. The reason that I would like to see this is that the current test image was probably created via the tested algorithm itself and we thus don't have any means of verifying that the algorithm actually worked correctly. While it is true that we cannot generate the ground truth via a closed form equation, the vote space is still generated in a deterministic way. I would even prefer a pixel-by-pixel (for non-trivial pixels) initialization of the vote space, something like this:
```java
// Assuming voteSpace is the vote space image: obtain an accessor,
// then set each non-trivial pixel by hand.
final RandomAccess< IntType > voteAccess = voteSpace.randomAccess();
voteAccess.setPosition( nonZeroPosition );
voteAccess.get().set( nonZeroCount );
```
For the current parameter setting, 2468 out of a total of 25380 pixels are non-zero. That is obviously far too much to generate by hand. If we can get the number of non-zero pixels down to a maintainable amount by reducing the vote space size and image size, we should do that. If you think that doing this is too much work, we should just abandon that test and opt for a higher-level test on the peaks only (as described above). Otherwise we would just run something like this:
```java
int doSomeComputation();
assertEquals( doSomeComputation(), doSomeComputation() );
```
@hanslovsky
The reason that I would like to see this is that the current test image was probably created via the tested algorithm itself and we thus don't have any means of verifying that the algorithm actually worked correctly.
Why is it that we do not know that the algorithm worked correctly? It is not difficult to take the output point(s) of the algorithm, convert them into the line(s) that they refer to, and check whether they are similar to the input. For example, to generate the current test image, pixels were set to 255 according to the equation `y = 24`. We have verified that the algorithm works correctly since the only maximum returned from this algorithm is at `(rho, theta) = (24, pi/2)`, which are the correct `(rho, theta)` coordinates for such a line. I guess I am confused as to what more is required to make sure that the algorithm is working correctly.
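The check described here can be reproduced numerically from the standard Hough parameterization rho = x\*cos(theta) + y\*sin(theta). The following is a standalone sketch, independent of the PR code:

```java
// For the horizontal line y = 24, every pixel (x, 24) votes for
// (rho, theta) = (24, pi/2), since rho = x*cos(pi/2) + 24*sin(pi/2) = 24,
// so the vote space maximum lands exactly where the comment above says.
public class HoughVoteCheck {

    static double rho(final double x, final double y, final double theta) {
        return x * Math.cos(theta) + y * Math.sin(theta);
    }

    public static void main(final String[] args) {
        final double theta = Math.PI / 2;
        for (int x = 0; x < 50; x++) {
            // Every x on the line contributes a vote at rho ~ 24.
            System.out.println(rho(x, 24, theta));
        }
    }
}
```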
While it is true that we cannot generate the ground truth via a closed form equation, the vote space is still generated in a deterministic way. I would even prefer a pixel-by-pixel (for non-trivial pixels) initialization of the vote space
I agree with this. However, I do not understand the method by which you want to generate these values. Even with, for example, an 11x13 image (the smallest test image that still returns accurate data), there will be no fewer than thousands of pixels. This is far too many to calculate by hand.
If you think that doing this is too much work we should just abandon that test and opt for a higher level test on the peaks only (as described above).
I would argue against this. The votespace includes a lot more information than just the peaks (it is important to make sure that all non-zero pixels are contained within the general flow of the space) and I think it is important to test for this too.
@gselzer
Why is it that we do not know that the algorithm did not work correctly? It is not difficult to take the output point(s) of the algorithm, convert them into the line(s) that they refer to, and check to see if it is(they are) similar to that of the input. For example, to generate the current test image pixels were set to 255 according to the equation y = 24. We have verified that the algorithm works correctly since the only maximum returned from this algorithm is at (rho, theta) = (24, pi/2), which are the correct (rho, theta) coordinates for such a line.
In general I do agree and this is what I was referring to with my more high-level test that does not look at the vote space and only checks if the correct parameters are obtained.
However, this is not what is happening in `testHoughLineTransformToTarget`. There, the check happens on the vote space, and the vote space is generated from `result8.png`. It is unclear how `result8.png` was generated, and I can only assume it was generated by the same algorithm it is supposed to test. As stated above, testing an algorithm against a result that was produced by the same algorithm does not make any sense.
I agree with this. However I do not understand the method by which you want to generate these values. Even with a, for example, 11x13 the smallest test image that still returns accurate data will have no less than thousands of pixels. This is way too much to calculate by hand.
For testing the vote space we are not worried about an accurate reconstruction of the parameters, but only that the vote space gets populated correctly given a parameter precision/bin size (that is the only task of `voteLines`). Pick a small enough vote space via `voteLines( RandomAccessibleInterval< T >, RandomAccessibleInterval< U >, int, int, T )` so you can populate a ground truth example by hand. Yes, that would require that you calculate the counts for all non-zero bins by hand. If that is too much work (which I can understand), drop the test for the vote space count. The higher-level check for correct reconstruction of the lines will have to suffice.
As stated above, testing an algorithm with a result that was produced by the same algorithm does not make any sense.
On the contrary: such regression tests are vital to ensure that the algorithm continues to work reproducibly in the future. It does not guarantee correctness (but then again: does anything?), but it does verify consistency.
On the contrary: such regression tests are vital to ensure that the algorithm continues to work reproducibly in the future. It does not guarantee correctness (but then again: does anything?), but it does verify consistency.
Only if we can comprehend how the ground truth was generated, otherwise a future error will be rather useless. I still don't see how an appropriate example cannot be hand crafted.
In general I do agree and this is what I was referring to with my more high-level test that does not look at the vote space and only checks if the correct parameters are obtained.
Ah, okay, now I understand what you mean by a high-level test. This is what `testPickPeaks` currently does; it takes the same image used in `testHoughLineTransformToTarget`, runs the transform and then `pickLinePeaks` on the resulting votespace, and then compares the resulting `Point`s to what is expected.
Only if we can comprehend how the ground truth was generated, otherwise a future error will be rather useless. I still don't see how an appropriate example cannot be hand crafted.
I think that the point @ctrueden and I are trying to make here is that this test has less to do with assuring the correctness of the algorithm and more to do with guaranteeing reproducibility. This test will fail if any math changes, and as such a success from this test is a sign that an output after an update to the algorithm is the same output that would have been given before. With that being said, you are right that it is not an indicator that the algorithm is working correctly; that is what `testPickPeaks` is designed to do. We can change the name of this test if that would help make it clearer that reproducibility is the goal.
@gselzer I still think that a hand-crafted test would do the same thing and, on top of that, would add documentation of what is being tested. But @ctrueden is an imglib2 admin and should have the last word on this.
The build currently fails with JavaDoc errors. Most of those errors are unrelated and fixed in master. I locally merged with master and the only remaining JavaDoc error is this:
```
[ERROR] /path/to/imglib2-algorithm/src/main/java/net/imglib2/algorithm/hough/HoughTransforms.java:325: error: reference not found
[ERROR] * vote space as well. Thus if {@link HoughTransforms#pickPeaks} is not used
```

This is an easy fix: just replace `pickPeaks` with `pickLinePeaks`. To confirm that JavaDoc generation does not fail, you can run `mvn javadoc:javadoc` (warnings can be ignored).
@gselzer Once you update
[x] my last requested change,
this looks good for a merge to me. Yours and @ctrueden's argument regarding the test case makes sense and I agree that we should leave it as is now.
@hanslovsky I just fixed the error and the requested change in the most recent commit. Thanks for taking all the time and effort to get this project ready to merge!
@gselzer I looked over it once more and it seems good to me. Thank you for putting the work in and bearing with me and all my pedantic requests. I will leave this open for a little bit longer (probably until the end of the week) to see if anyone else raises concerns, and then I will merge.
Good work!
@imglib/admins What is the appropriate merge strategy? Squash and merge?
@hanslovsky I personally prefer "Create a merge commit" to preserve the history, assuming each commit compiles with passing tests. Otherwise, what is the point of making clean topic branches? That said, I am not an ImgLib admin—see Governance summary—so ultimately it is up to @tpietzsch and @axtimwalde and @StephanPreibisch.
@ctrueden I was advised for my own (single-feature addition) PRs to rebase before a merge but I do not want to make this decision here.
I was advised for my own (single-feature addition) PRs to rebase before a merge
Doing that will result in a fast-forward merge, which loses the fact that it was ever a branch. This is a bad thing which has caused social confusion and resentment in the past, where fingers were pointed and people were accused of "not working on a branch" even though they did at the time. I strongly encourage the ImgLib2 admins to use "Create a merge commit." This can still be combined with a rebase first, if you want the topic branch to sit on top of the latest master. The goal is to avoid fast-forward merges.
@ctrueden sorry for the confusion. I was advised to rebase my branch, then the PR would be merged with a commit message. That was before github had options to rebase through the web interface.
This pull request ports the Hough transform from ImgLib1 to ImgLib2; see also this issue: https://github.com/imagej/imagej-ops/issues/465