danforthcenter / plantcv

Plant phenotyping with image analysis
Mozilla Public License 2.0

method to identify leaves outside field of view #371

Closed dschneiderch closed 5 years ago

dschneiderch commented 5 years ago

Mostly a usage question and possible feature request. I am working on automatically filtering and QC'ing the results from my image analyses. I'm wondering if you guys have any way to detect when plants grow outside the field of view?

[image: A2-GoldStandard_RGB-2019-03-31 16_11_45-VIS0_greenness]

Could I, e.g., raise an error if the contour crosses x=0?

HaleySchuhl commented 5 years ago

Hi @dschneiderch,

Within plantcv.analyze_object there is data collected about whether the object is in bounds or not. I believe this detects when plants extend far enough to touch the top of the image, but this could be extended to horizontal bounds as well!

nfahlgren commented 5 years ago

That's correct, one of the observations in the analyze_object output is called in_bounds (I'm proposing to change it to object_in_frame in the next release). It is a True/False value. True = object completely in the field of view, False = object touches the border of the image on any side.

dschneiderch commented 5 years ago

Just looked through it. Any objections if I made a new function for this so I can call it independently of analyze_object?

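import cv2
import numpy as np


def object_in_frame(img, obj):
    # Check if object is touching image boundaries (QC)
    # Image size as (rows, columns); color images have a third dimension
    if len(np.shape(img)) == 3:
        ix, iy, iz = np.shape(img)
    else:
        ix, iy = np.shape(img)
    size1 = ix, iy
    # Build an all-ones image so that its only contour is the image border
    frame_background = np.zeros(size1, dtype=np.uint8)
    frame = frame_background + 1
    frame_contour, frame_hierarchy = cv2.findContours(frame, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)[-2:]
    ptest = []
    vobj = np.vstack(obj)
    for i, c in enumerate(vobj):
        xy = (int(c[0]), int(c[1]))
        # Returns 1 if the point is inside the frame, 0 if on the border, -1 if outside
        pptest = cv2.pointPolygonTest(frame_contour[0], xy, measureDist=False)
        ptest.append(pptest)
    # In bounds only if every contour point is strictly inside the frame
    in_bounds = all(c == 1 for c in ptest)

    return in_bounds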

nfahlgren commented 5 years ago

That would be okay. I wouldn't want to duplicate the code so if it's pulled out separately it could be called from within analyze_object to keep the current behavior in addition to being able to use it independently.

I can see some value to having it separate for QC where you might not have done object_composition or other steps yet. If it's just about easily accessing the value though, note that in the next release all observations collected with analyze_object and other functions will be accessible in a dictionary lookup pcv.outputs.observations["object_in_frame"]["value"].

The separate function would also need some test(s) and a documentation page.

dschneiderch commented 5 years ago

I don’t usually call analyze_object (most people here wouldn’t know what to do with all that info) and I was thinking that calling that function was a bit heavy handed if all I wanted was in_bounds. I could also just keep a function on my computer if you’d prefer not to break it up. I was thinking to use it before doing any of the analyses to save time.

nfahlgren commented 5 years ago

I can see it being useful when you just want to do a quick QC check.

dschneiderch commented 5 years ago

I'm working on tests. I have not implemented a test before. Do you know which, if any, of the test_img and test_contour inputs are from the same image? It looks like I am not supposed to use any pcv functions; otherwise I would copy test_plantcv_object_composition() and add pcv.within_frame(). Correct? Also, does there need to be a test for both a True and a False result, or would one suffice with assert(in_bounds is True or in_bounds is False)?

dschneiderch commented 5 years ago

Follow-up on what test data is available: are any of the TEST_INPUT_CONTOURS a single contour, i.e., one that doesn't need object_composition()?

HaleySchuhl commented 5 years ago

Hi @dschneiderch,

Within the existing testing data, TEST_INTPUT_MULTI and TEST_INPUT_MULTI_OBJECT are for the same image, so you could use those. I believe the contour in TEST_INPUT_MULTI_OBJECT wouldn't require an object composition step. One example of a test that uses these inputs is test_plantcv_auto_crop.

The type of testing needed just depends on the function. We try to cover every line of code, so based on the object_in_frame function code you posted yesterday you'd need to cover the case where the image is grayscale and the case where the image is color to cover both portions of the if/else statement. Best practice is to have a separate test for every use case, but we often concatenate tests (for example, testing debug='plot' and debug='print' in the same unit test), so you could always read the image in as grayscale and as color to cover your code. As you mentioned, it is definitely preferred not to use any other PlantCV functions within a unit test, but this is something that we haven't been strict on (e.g., unit testing for morphology sub-package functions).
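To illustrate the idea of covering both portions of the if/else in a single concatenated test, here is a minimal sketch. The helper `image_dimensions` and the test name are hypothetical and only mirror the shape handling in the function posted above; synthetic NumPy arrays stand in for the real test images.

```python
import numpy as np


def image_dimensions(img):
    """Hypothetical helper mirroring the if/else in the posted function."""
    if len(np.shape(img)) == 3:
        # Color image: rows, columns, channels
        height, width, _ = np.shape(img)
    else:
        # Grayscale image: rows, columns
        height, width = np.shape(img)
    return height, width


def test_image_dimensions():
    # Synthetic inputs (assumption): one grayscale and one color image
    gray = np.zeros((10, 20), dtype=np.uint8)
    color = np.zeros((10, 20, 3), dtype=np.uint8)
    # One call per branch so both sides of the if/else are executed
    assert image_dimensions(gray) == (10, 20)
    assert image_dimensions(color) == (10, 20)
```

Splitting this into two tests would follow the stricter "one test per use case" practice; the combined form trades that for brevity.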

Hope that helps! Let us know if you have any other questions or concerns.

dschneiderch commented 5 years ago

It seems the easiest thing to do is to use the mask as the input so we only have to check a binary image. If you have an object, it probably came from a binary image, right?

dschneiderch commented 5 years ago

Also, I noticed there were a couple of TEST* items called INTPUT... is this a typo?

HaleySchuhl commented 5 years ago

I agree, a mask would probably be the easiest input as it eliminates the need for the user to also input an image and the QC check could take place earlier in a workflow. I definitely think instances of INTPUT are likely typos haha!

dschneiderch commented 5 years ago

How do you run the tests? I tried python -m tests from my plantcv folder but it said No module named tests.__main__; 'tests' is a package and cannot be directly executed
Same when I tried python -m tests/test.py

HaleySchuhl commented 5 years ago

The command I run on the command line from my local PlantCV directory is pytest -v tests/tests.py -k test_keyword_here. For example pytest -v tests/tests.py -k bad_input will run every test that contains "bad_input" in the name of the test.

I also really like the ability to run coveralls locally so I can ensure all lines of code are getting covered by unit tests, although there is no option to run a subset of tests when doing this (as far as I am aware). In the command line I run coverage run --source=plantcv setup.py test and then coverage report -m to view the coverage report.

dschneiderch commented 5 years ago

ok.... not very verbose but

(plantcv) C:\Users\dominikschneider\Documents\plantcv>pytest -v tests/tests.py -k within_frame
============================= test session starts =============================
platform win32 -- Python 3.6.8, pytest-4.0.1, py-1.7.0, pluggy-0.8.0 -- C:\Users\dominikschneider\Anaconda3\envs\plantcv\python.exe
cachedir: .pytest_cache
rootdir: C:\Users\dominikschneider\Documents\plantcv, inifile: setup.cfg
collecting ...
(plantcv) C:\Users\dominikschneider\Documents\plantcv>

nfahlgren commented 5 years ago

We were just thinking, instead of using contours (the point polygon test has to iterate over every contour point), what if you just used NumPy and the binary mask to check whether the mask edges are black or not? I think it would be faster when there are more complex contours.

Just a code snippet:

height, width = np.shape(mask)

# First column
first_col = mask[:, 0]

# Last column
last_col = mask[:, width - 1]

# First row
first_row = mask[0, :]

# Last row
last_row = mask[height - 1, :]

edges = np.concatenate([first_col, last_col, first_row, last_row])

out_of_bounds = bool(np.count_nonzero(edges))
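
The snippet above could be wrapped into a standalone check along these lines. This is only a sketch: the function name `mask_within_frame` is hypothetical (not the PlantCV API), and it assumes a 2D mask where background pixels are 0.

```python
import numpy as np


def mask_within_frame(mask):
    """Return True if no nonzero pixel lies on the image border."""
    # Hypothetical name for illustration; assumes a 2D mask with 0 as background
    edges = np.concatenate([mask[:, 0], mask[:, -1], mask[0, :], mask[-1, :]])
    return not bool(np.count_nonzero(edges))


# Object fully inside the frame
inside = np.zeros((10, 10), dtype=np.uint8)
inside[3:6, 3:6] = 255

# Object touching the top row of the image
touching = np.zeros((10, 10), dtype=np.uint8)
touching[0:4, 3:6] = 255

inside_result = mask_within_frame(inside)
touching_result = mask_within_frame(touching)
```

Here the four border slices are examined in one pass, so the cost depends on the image perimeter rather than on the number of contour points.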

dschneiderch commented 5 years ago

Ha! I already did that and then didn't want to change too much - not sure what lies downstream!

I'll go back to that because I seem to have broken analyze_object and within_frame isn't working well. Maybe I uncovered a small bug for a special use case? Does obj get changed inside the within_frame function? I ask because if I give the functions below c1[0], they run fine. within_frame is almost an exact copy of what you had.

The pink image is from find_objects, blue outline is from object_composition, then you can see the output from analyze_object is None, None, None. And then calling within_frame also throws an error with the objects.

c,h = pcv.find_objects(img, mask)
c1=pcv.object_composition(img, c, h)
pcv.analyze_object(img, c1, mask)
pcv.within_frame(mask,c1)

[image]

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-39-7e9b46b589f6> in <module>
----> 1 pcv.within_frame(mask,c1)

~\Anaconda3\envs\plantcv\lib\site-packages\plantcv-3.2.0+107.gc0d09d5.dirty-py3.6.egg\plantcv\plantcv\within_frame.py in within_frame(mask, obj)
     29     frame_contour, frame_hierarchy = cv2.findContours(frame, cv2.RETR_TREE,  cv2.CHAIN_APPROX_NONE)[-2:]
     30     ptest = []
---> 31     vobj = np.vstack(obj)
     32     for i, c in enumerate(vobj):
     33         xy = tuple(c)

~\Anaconda3\envs\plantcv\lib\site-packages\numpy\core\shape_base.py in vstack(tup)
    232 
    233     """
--> 234     return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
    235 
    236 def hstack(tup):

ValueError: all the input arrays must have same number of dimensions

dschneiderch commented 5 years ago

presumably the function is getting tripped at L35 since len(c1) = 2.

i guess this is a bit of a contrived example because I didn't put any effort into making a good mask. But this case study should probably be covered?

nfahlgren commented 5 years ago

It's because the input obj to analyze_object has to be from the output of object_composition; it has to be a flattened contour. But you don't need this, right? You could just read in the image, make a mask somehow, and then run within_frame, correct?

dschneiderch commented 5 years ago

doh! I had:

c1=pcv.object_composition(img, c, h)
pcv.analyze_object(img, c1, mask)

instead of

c1,m1=pcv.object_composition(img, c, h)
pcv.analyze_object(img, c1, m1)

But I will modify the function to just use the mask.
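
For anyone hitting the same ValueError, the following sketch reproduces it with plain NumPy. The arrays are stand-ins with assumed shapes (a contour of shape (N, 1, 2) and a 2D mask), not real PlantCV output.

```python
import numpy as np

# Stand-ins for the two values returned together (assumed shapes)
contour = np.array([[[1, 1]], [[1, 5]], [[5, 5]], [[5, 1]]], dtype=np.int32)  # (4, 1, 2)
mask = np.zeros((10, 10), dtype=np.uint8)  # (10, 10)

# What a single-name assignment captures: a tuple of (contour, mask)
result = (contour, mask)

try:
    # Stacking a 3D contour array with a 2D mask fails
    np.vstack(result)
    stacked = True
except ValueError:
    stacked = False

# Unpacking first gives the contour alone, which stacks into (N, 2) points
obj, obj_mask = result
points = np.vstack(obj)
```

This also explains why passing c1[0] worked: the first element of the tuple is the contour on its own.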

dschneiderch commented 5 years ago

Just finishing up tests and learned that input_mask.png in tests/data has more than 2 values. Do you expect this? I'm seeing now that you have a separate TEST_INPUT_BINARY... I guess there is no reason within_frame needs to get a binary mask, since from cv2's perspective masks are zero and non-zero.

I had included a binary check at the top:

    if len(np.shape(mask)) > 2 or len(np.unique(mask)) > 2:
        fatal_error("Mask should be a binary image of 0 and nonzero values.")

too much?

dschneiderch commented 5 years ago

another question: The only way I can get analyze_object to run is if I use:

in_bounds = within_frame.within_frame(mask)

I have from plantcv.plantcv import within_frame at the top.

How come plot_image can just be called with plot_image?

nfahlgren commented 5 years ago

input_mask.png should be binary, not sure why it has other values. I think it's reasonable to have a binary check. As you said it doesn't technically need to be binary but since a mask is the input it should be.
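
A minimal sketch of such a binary check, runnable without PlantCV: the function name `check_binary_mask` is hypothetical, and a plain ValueError stands in for fatal_error.

```python
import numpy as np


def check_binary_mask(mask):
    """Raise ValueError unless mask is 2D with at most two distinct values."""
    # ValueError is a stand-in for PlantCV's fatal_error (assumption for this sketch)
    if len(np.shape(mask)) > 2 or len(np.unique(mask)) > 2:
        raise ValueError("Mask should be a binary image of 0 and nonzero values.")


binary = np.zeros((10, 10), dtype=np.uint8)
binary[3:6, 3:6] = 255
check_binary_mask(binary)  # passes silently

# A mask with a third value, like the input_mask.png case described above
three_values = binary.copy()
three_values[0, 0] = 128
try:
    check_binary_mask(three_values)
    rejected = False
except ValueError:
    rejected = True
```

Note that this condition only counts distinct values; it does not verify that one of them is 0.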

nfahlgren commented 5 years ago

Since we want to import within_frame in analyze_object, the import statement from plantcv.plantcv.within_frame import within_frame in plantcv/plantcv/__init__.py needs to be listed before the analyze_object import.