panovr closed this pull request 6 years ago.
Just as a note: The link to issue 465 is broken in your comment. I can c&p the text though.
@hanslovsky fixed
Thanks so much for working on this! It is looking really, really good!
Three main concerns:
If you can address the above three things, I am happy.
I notice that this PR covers hough line transform. Are there plans for hough circle transform?
@ctrueden I really appreciate the detailed code review, thanks!
FIX: `imglib2-algorithm-gpl` is needed for the `PickImagePeaks` class in `HoughTransform.java` at this line.
TODO:
HELP: Should I call `public void setSuppression( final double[] r )` in class `PickImagePeaks` after the vote space image has been computed, if I want to invoke the method `public ArrayList< long[] > getPeakList()` at this line?
@tinevez the `imglib2-algorithm-gpl` is needed for the `PickImagePeaks` class.
OK. If the implementation would be more complicated without using this class, then I propose to add these Hough classes instead to imglib2-algorithm-gpl, to avoid circular dependency, and to have correct, legal licensing.
I will add unit tests. Is there any example I can follow?
There are tons of unit tests across many components. We use JUnit. You could look at e.g. `DilationTest` in this repository to see how Jean-Yves tests the morphological dilate operation.
What exactly is the `PickImagePeaks` class used for in this case? Is it simply to find local extrema? You could use `LocalExtrema` instead.
@panovr Did you try with `LocalExtrema` yet?
@hanslovsky @ctrueden I will investigate `LocalExtrema` later.
I am taking care of the Hough circle transform.
@tinevez OK, thank you very much!
@tinevez are you planning to migrate your work in https://github.com/tinevez/CircleSkinner/tree/master/src/main/java/net/imagej/circleskinner/hough to imglib2-algorithm?
Yes, before the end of June.
@tinevez I have a new undergrad student, @gselzer, working on ImageJ Ops and ImgLib2 stuff now. If there is anything he could take off your plate relating to the Hough transform project, just let us know!
Christ I am late everywhere. @gselzer can we chat over gitter sometime about this?
@tinevez I'm sorry that I did not see this comment before, but I am just now beginning to move onto this project and would love to chat on Gitter!
See also imagej/imagej-ops#526.
@panovr did you have a look at `LocalExtrema` and whether or not it could (partially) replace `PickImagePeaks`?
@hanslovsky FYI, @panovr has moved on from LOCI (back to China, where his main project is machine learning classification and segmentation of bird species). But our student @gselzer could (will! :wink:) work on completing this PR.
Thanks for the update @ctrueden
@hanslovsky @ctrueden Fixes:
- `LocalExtrema` is now used instead of `PickImagePeaks`.
- All commits build and all tests pass on each commit.
@hanslovsky @tpietzsch @tinevez After going through this PR with @gselzer and cleaning things up, we are now reasonably satisfied with it. The dependency on GPL code was purged.
It is questionable whether various calculation methods should really be returning `boolean` to indicate success. What do you think? Any further changes you'd like to see before merging this?
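For context, here is a generic sketch contrasting the boolean-success pattern with the usual alternative of throwing an exception. This is not the PR's actual code; the class and method names are made up for illustration:

```java
// Generic sketch of two failure-reporting styles; the names here are
// hypothetical, not the actual HoughTransforms API.
public class FailureReporting {

    // Style 1: boolean return value. The caller must remember to check it,
    // and a forgotten check silently swallows the failure.
    static boolean computeWithFlag(final int[] data) {
        if (data == null || data.length == 0)
            return false;
        // ... perform the computation ...
        return true;
    }

    // Style 2: throw on invalid input. Failure cannot be accidentally
    // ignored, and the signature stays free of a success flag.
    static void computeOrThrow(final int[] data) {
        if (data == null || data.length == 0)
            throw new IllegalArgumentException("data must be non-empty");
        // ... perform the computation ...
    }
}
```

The exception style is generally preferred in Java APIs when failure indicates a programming error rather than an expected outcome.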
@ctrueden @gselzer I haven't had time to go through the code yet but hopefully will be able to do so this week. I agree that reporting success through a `boolean` return value is probably not necessary, but I will say more once I have had a closer look.
I finally had some time to go over the code. I feel that this PR needs some work before we can merge it. These are my (itemized) thoughts:
- `RandomAccessibleInterval` should be used as parameter over `Img`.
- `RandomAccessibleInterval` allocated by the caller. That way, `OutputAlgorithm.getResult` becomes obsolete for this class and static methods can be used as suggested above.
- Benchmark
- Peaks in the votespace should not be `List< long[] >` but `List< Point >` instead:
  - `List< long[] >` creates unnecessary copies, and
  - the `long[]` array is not cloned before adding, i.e. all maxima are the same.

  This will make `HoughTransform.pickPeaks` trivial/obsolete. As there are several bugs in this method (cf. comments in diff), this seems like a good idea.
- No `ExecutorService` or related classes/methods should be passed. Parallelization can be implemented on top of single-threaded basic functionality (if desired at all).
- `findLocalExtrema( RandomAccessibleInterval, LocalNeighborhoodCheck, ExecutorService )` is the only method that requires an `ExecutorService` and should be replaced by one of the more recent `LocalExtrema.findLocalExtrema` signatures that do not provide parallelization.

Thanks @hanslovsky! @gselzer Please make the changes when you get a chance.
@hanslovsky I agree that the votespace peaks could be `List<Point>` instead of `List<long[]>`. I fail to understand, however, how changing this would make `HoughTransform.pickPeaks()` obsolete, unless you are proposing moving the peak picking to another location. I think that going forward with this would certainly trim the method down, but it should not be obsolete. Can you elaborate on this?
@gselzer With `List<Point>` instead of `List<long[]>`, `HoughTransform.pickPeaks` would be reduced to a single line (or two, considering that we have a generic `T` instead of `IntType`):

```java
LocalExtrema.findLocalExtrema( voteSpace, new LocalExtrema.MaximumCheck<>( new IntType( 1 ) ) );
```
In general, users might want to use their own ways to detect local extrema, e.g. to minimize noise, but you are right that it is probably best to leave the `pickPeaks` method in place for users who do not.
@hanslovsky @ctrueden I have fixed all of the comments above, either in their own commit or in the larger commit where I refactored the algorithm to use static methods instead of stateful classes (I would like to fix the git history at some point as well). Let me know any further comments you might have.
I massaged the old votespace regression test so that it now passes. I am currently writing a peak finder regression test, since such a test was never written, and the results are looking promising. The only problem that I have is that the width of the vote space is over the wrong interval; currently this interval is [0, maxRho * 2), whereas we would like [-maxRho, maxRho) to be the interval. Is there a way to change this interval?
My solution to the interval problem was to create a translated `IntervalView` within `PickPeaks`, which allows us to return the correct polar coordinates for our line without a) automatically mutating the input at any point or b) requiring the user to do so before getting the peaks from the vote space. For now I do not see any reason to make this optional, since without this translation `PickPeaks` will not return the correct polar coordinate results; however, if there is a reason to make it optional, I can do so.
I also added static methods to generate the slope and intercept from the polar coordinate output, as well as tests for all relevant methods.
Let me know if there are any issues.
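For readers following along, the polar-to-slope/intercept conversion mentioned above follows from the standard Hough line parameterization rho = x\*cos(theta) + y\*sin(theta). Below is a minimal standalone sketch of that math; the class and method names are illustrative, not the actual methods added in this PR:

```java
// Converting Hough polar coordinates (rho, theta) back to slope/intercept
// form y = m*x + b, using rho = x*cos(theta) + y*sin(theta).
// Class and method names here are illustrative, not the PR's actual API.
public class HoughLineConversion {

    // m = -cos(theta) / sin(theta); undefined for vertical lines (sin(theta) == 0).
    static double slope(final double theta) {
        return -Math.cos(theta) / Math.sin(theta);
    }

    // b = rho / sin(theta); also undefined for vertical lines.
    static double intercept(final double rho, final double theta) {
        return rho / Math.sin(theta);
    }

    public static void main(final String[] args) {
        // The horizontal line y = 24 has (rho, theta) = (24, pi/2):
        // slope ~ 0 and intercept 24.
        System.out.println(slope(Math.PI / 2));
        System.out.println(intercept(24, Math.PI / 2));
    }
}
```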
@hanslovsky Regarding your comment about avoiding `ImageIO` in favor of a hand-designed ground truth: it would be helpful to have some clarification about what the concerns are. Is it the usage of a `javax` class in the test scope? Are you concerned about the test being incompatible with some JVMs? Or is there some other reason you do not like the result image saved as a resource? The alternative (embedding a large primitive array in the test class) seems no better to me, as it would be larger on disk.
@ctrueden My issues with the current test are two-fold:
- The test depends on `javax` and `awt`. imglib2 is an image processing library and not an image io/ui library, and therefore should avoid dependencies on UI framework libraries. This is my personal opinion and maybe the @imglib/admins should comment on that.

@hanslovsky I do not know if this changes your opinion on the matter, however the size of the test image is 50 x 50, not 501 x 501 (the 50s are being specified as `long`, which might explain your confusion). With that being said, we could refactor this test to a non-square input image if that would solve any issues.
Thanks for the clarification @gselzer. I still believe that the ground truth for the test should be hand-crafted within the code for reproducibility. If we don't know how the test ground truth was generated, we will not be able to know whether the test fails because the algorithm is wrong or because we messed up something in the test.
I suggest:
@hanslovsky I will rewrite the tests such that we test against a smaller, non-square image with 2+ lines. I agree that this will create a better test. However, I still do not understand what you mean by a "hand-crafted" ground truth. My understanding of this (and correct me if I am wrong) is that you want a way, in the code, of determining what the value should be at a particular pixel instead of relying on an image. This is understandable, but because there is really no equation that describes these vote spaces, the only options that I see here are:
- a `byte[]` within the test

I do not really know which of these would be the right option.
@gselzer Yes, you understood me correctly there. The reason that I would like to see this is that the current test image was probably created via the tested algorithm itself and we thus don't have any means of verifying that the algorithm actually worked correctly. While it is true that we cannot generate the ground truth via a closed form equation, the vote space is still generated in a deterministic way. I would even prefer a pixel-by-pixel (for non-trivial pixels) initialization of the vote space, something like this:
```java
// Assuming voteSpace is the vote space image: obtain an accessor,
// then set each non-trivial pixel by hand.
final RandomAccess< IntType > voteAccess = voteSpace.randomAccess();
voteAccess.setPosition( nonZeroPosition );
voteAccess.get().set( nonZeroCount );
```
For the current parameter setting, 2468 out of a total of 25380 pixels are non-zero. That is obviously far too much to generate by hand. If we can get the number of non-zero pixels down to a maintainable amount by reducing the vote space size and image size, we should do that. If you think that doing this is too much work, we should just abandon that test and opt for a higher-level test on the peaks only (as described above). Otherwise we would just run something like this:
```java
int doSomeComputation();
assertEquals( doSomeComputation(), doSomeComputation() );
```
@hanslovsky
The reason that I would like to see this is that the current test image was probably created via the tested algorithm itself and we thus don't have any means of verifying that the algorithm actually worked correctly.
Why is it that we do not know that the algorithm worked correctly? It is not difficult to take the output point(s) of the algorithm, convert them into the line(s) that they refer to, and check whether they are similar to the input. For example, to generate the current test image, pixels were set to 255 according to the equation `y = 24`. We have verified that the algorithm works correctly since the only maximum returned from this algorithm is at `(rho, theta) = (24, pi/2)`, which are the correct `(rho, theta)` coordinates for such a line. I guess I am confused as to what more is required to make sure that the algorithm is working correctly.
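The check described here can be reproduced numerically from the standard Hough parameterization rho = x\*cos(theta) + y\*sin(theta). The following is a standalone sketch, independent of the PR code:

```java
// For the horizontal line y = 24, every pixel (x, 24) votes for
// (rho, theta) = (24, pi/2), since rho = x*cos(pi/2) + 24*sin(pi/2) = 24,
// so the vote space maximum lands exactly where the comment above says.
public class HoughVoteCheck {

    static double rho(final double x, final double y, final double theta) {
        return x * Math.cos(theta) + y * Math.sin(theta);
    }

    public static void main(final String[] args) {
        final double theta = Math.PI / 2;
        for (int x = 0; x < 50; x++) {
            // Every x on the line contributes a vote at rho ~ 24.
            System.out.println(rho(x, 24, theta));
        }
    }
}
```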
While it is true that we cannot generate the ground truth via a closed form equation, the vote space is still generated in a deterministic way. I would even prefer a pixel-by-pixel (for non-trivial pixels) initialization of the vote space
I agree with this. However, I do not understand the method by which you want to generate these values. Even with, for example, an 11x13 image (the smallest test image that still returns accurate data), there will be no fewer than thousands of pixels. This is far too many to calculate by hand.
If you think that doing this is too much work we should just abandon that test and opt for a higher level test on the peaks only (as described above).
I would argue against this. The votespace includes a lot more information than just the peaks (it is important to make sure that all non-zero pixels are contained within the general flow of the space) and I think it is important to test for this too.
@gselzer
Why is it that we do not know that the algorithm did not work correctly? It is not difficult to take the output point(s) of the algorithm, convert them into the line(s) that they refer to, and check to see if it is(they are) similar to that of the input. For example, to generate the current test image pixels were set to 255 according to the equation y = 24. We have verified that the algorithm works correctly since the only maximum returned from this algorithm is at (rho, theta) = (24, pi/2), which are the correct (rho, theta) coordinates for such a line.
In general I do agree and this is what I was referring to with my more high-level test that does not look at the vote space and only checks if the correct parameters are obtained.
However, this is not what is happening in `testHoughLineTransformToTarget`. There, the check happens on the vote space, and the vote space is generated from `result8.png`. It is unclear how `result8.png` was generated, and I can only assume it was generated by the same algorithm it is supposed to test. As stated above, testing an algorithm against a result that was produced by the same algorithm does not make any sense.
I agree with this. However I do not understand the method by which you want to generate these values. Even with a, for example, 11x13 the smallest test image that still returns accurate data will have no less than thousands of pixels. This is way too much to calculate by hand.
For testing the vote space we are not worried about an accurate reconstruction of the parameters, but only that the vote space gets populated correctly given a parameter precision/bin size (that is the only task of `voteLines`). Pick a small enough vote space via `voteLines( RandomAccessibleInterval< T >, RandomAccessibleInterval< U >, int, int, T )` so you can populate a ground truth example by hand. Yes, that would require that you calculate the counts for all non-zero bins by hand. If that is too much work (which I can understand), drop the test for the vote space count. The higher-level check for correct reconstruction of the lines will have to suffice.
As stated above, testing an algorithm with a result that was produced by the same algorithm does not make any sense.
On the contrary: such regression tests are vital to ensure that the algorithm continues to work reproducibly in the future. It does not guarantee correctness (but then again: does anything?), but it does verify consistency.
On the contrary: such regression tests are vital to ensure that the algorithm continues to work reproducibly in the future. It does not guarantee correctness (but then again: does anything?), but it does verify consistency.
Only if we can comprehend how the ground truth was generated, otherwise a future error will be rather useless. I still don't see how an appropriate example cannot be hand crafted.
In general I do agree and this is what I was referring to with my more high-level test that does not look at the vote space and only checks if the correct parameters are obtained.
Ah, okay, now I understand what you mean by a high-level test. This is what `testPickPeaks` currently does; it takes the same image used in `testHoughLineTransformToTarget`, runs the transform and then `pickLinePeaks` on the resulting votespace, and then compares the resulting `Point`s to what is expected.
Only if we can comprehend how the ground truth was generated, otherwise a future error will be rather useless. I still don't see how an appropriate example cannot be hand crafted.
I think that the point @ctrueden and I are trying to make here is that this test has less to do with assuring the correctness of the algorithm and more to do with guaranteeing reproducibility. This test will fail if any math changes, and as such a success from this test is a sign that an output after an update to the algorithm is the same output that would have been given before. With that being said, you are right that it is not an indicator that the algorithm is working correctly; that is what `testPickPeaks` is designed to do. We can change the name of this test if that would help make it clearer that reproducibility is the goal.
@gselzer I still think that a hand-crafted test would do the same thing and, on top of that, would add documentation of what is being tested. But @ctrueden is an imglib2 admin and should have the last word on this.
The build currently fails with JavaDoc errors. Most of those errors are unrelated and fixed in master. I locally merged with master and the only remaining JavaDoc error is this:
```
[ERROR] /path/to/imglib2-algorithm/src/main/java/net/imglib2/algorithm/hough/HoughTransforms.java:325: error: reference not found
[ERROR] * vote space as well. Thus if {@link HoughTransforms#pickPeaks} is not used
```

This is an easy fix: just replace `pickPeaks` with `pickLinePeaks`. To confirm that JavaDoc generation does not fail, you can run `mvn javadoc:javadoc` (warnings can be ignored).
@gselzer Once you update
[x] my last requested change,
this looks good for a merge to me. Yours and @ctrueden's argument regarding the test case makes sense and I agree that we should leave it as is now.
@hanslovsky I just fixed the error and the requested change in the most recent commit. Thanks for taking all the time and effort to get this project ready to merge!
@gselzer I looked over it once more and it seems good to me. Thank you for putting the work in and bearing with me and all my pedantic requests. I will leave this open for a little bit longer (probably until the end of the week) to see if anyone else raises concerns, and then I will merge.
Good work!
@imglib/admins What is the appropriate merge strategy? Squash and merge?
@hanslovsky I personally prefer "Create a merge commit" to preserve the history, assuming each commit compiles with passing tests. Otherwise, what is the point of making clean topic branches? That said, I am not an ImgLib admin—see Governance summary—so ultimately it is up to @tpietzsch and @axtimwalde and @StephanPreibisch.
@ctrueden I was advised for my own (single-feature addition) PRs to rebase before a merge but I do not want to make this decision here.
I was advised for my own (single-feature addition) PRs to rebase before a merge
Doing that will result in a fast-forward merge, which loses the fact that it was ever a branch. This is a bad thing which has caused social confusion and resentment in the past, where fingers were pointed and people were accused of "not working on a branch" even though they did at the time. I strongly encourage the ImgLib2 admins to use "Create a merge commit." This can still be combined with a rebase first, if you want the topic branch to sit on top of the latest master. The goal is to avoid fast-forward merges.
@ctrueden sorry for the confusion. I was advised to rebase my branch, then the PR would be merged with a commit message. That was before github had options to rebase through the web interface.
This pull request ports the Hough transform from ImgLib1 to ImgLib2; see also this issue: https://github.com/imagej/imagej-ops/issues/465