masonearles / 3DLeafCT

Random forest segmentation of 3D leaf microCT images

"Hard to label" stacks (e.g. curved / irregular leaf, curved tape) - 3D instead of 2D training? #3

Closed by gtrancourt 6 years ago

gtrancourt commented 6 years ago

This is more of an open question and maybe a future to-do, but I feel my first try with the script was on a hard-to-label image. I ended up training the algorithm on 30 contiguous slices and testing it on 10 contiguous slices (this took almost 4 hours), using cells, air, veins, tape, and the outside of the leaf as labels. The training ended up being quite good, as below:

image Training labelled image

image Predicted

However, after running the trained algorithm on the whole stack, I still ended up with some not-so-good labelling, like these:

image image Tape leaking into the cells and veins

image Cells leaking into the tape and outside of leaf as air

image Mid-veins are ok

Looking just at the veins, this seems an appropriate result given the minimal effort I had to put in to get it. So my question is: is it possible to train the algorithm in 3D (i.e. looking at all surrounding voxels) instead of training it on specific images? This would probably increase the computing time, but would benefit the labelling of harder-to-measure stacks. If you look at the mid-vein image above, there are a few interruptions in the vein in the middle. I don't know how to do this or how difficult it would be to implement.

mattjenkins3 commented 6 years ago

I think Mason will want to weigh in on this as well, but we are considering several methods for post-processing the predicted stack to improve segmentation accuracy. One method would involve using an algorithm to build a 2D mesh over the surfaces of 3D objects like veins, thereby 'smoothing' the edges. Comparing the positions of predicted voxels that fall inside and outside of the 2D mesh could then be used to 'correct' inaccurate predictions at boundaries such as those between vein/bundle sheath and mesophyll tissue.

The misclassification of tape has been a persistent issue, so thanks for highlighting it here. As Mason pointed out in his email, segmenting the tape as a separate class might be a great way to avoid this issue, though it does not seem to have worked very well here. We are using a distance-from-edge filter to help classify epidermis, but I've also noticed this method breaks down when leaf edges are not close to parallel with the image borders.

Any ideas for how we could train on a 3D dataset?

masonearles commented 6 years ago

Regarding the 'curvy leaf' issue: we could try adding another feature layer that takes into account the relative x- and y-coordinate positions, as opposed to just the y-coordinate position that we currently use. This should be somewhat helpful if there is consistency in where the epidermis is located through the stack for a given x-y coordinate. Other than that, we could try breaking the image up into several parts with relatively straight epidermal boundaries, training on each, and then putting them back together. Again, this would work best if there's consistency in the epidermal position through the stack for a given x-y coordinate.
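
To make the coordinate-layer idea concrete, here is a minimal sketch using NumPy. It is not the repository's code: the function name `coordinate_feature_layers` and the 0 to 1 scaling are assumptions chosen for illustration. It builds one layer per axis so that each pixel carries its relative row (y) and column (x) position, which could then be stacked alongside the intensity-based features.

```python
import numpy as np

def coordinate_feature_layers(slice_2d):
    """Return two layers with the relative (0..1) row and column position of each pixel.

    Hypothetical helper for illustration; not part of 3DLeafCT.
    """
    n_rows, n_cols = slice_2d.shape
    rows, cols = np.meshgrid(
        np.linspace(0.0, 1.0, n_rows),   # relative y position
        np.linspace(0.0, 1.0, n_cols),   # relative x position
        indexing="ij",
    )
    return rows, cols

# Usage: one row per pixel, one column per feature (intensity, y, x).
slice_2d = np.zeros((4, 6))
y_rel, x_rel = coordinate_feature_layers(slice_2d)
features = np.column_stack([slice_2d.ravel(), y_rel.ravel(), x_rel.ravel()])
```

Scaling to a relative position rather than raw pixel indices keeps the feature comparable between stacks of different sizes, though whether that helps with a curved epidermis would need testing.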

Regarding 3D training, I think this would end up being a very challenging and, probably more to the point, very computationally expensive task. This is especially true if you are trying to train on all slices... plus, this would mean that you already have the entire leaf classified, right? Alternatively, you can derive new 2D feature layers using 3D information. For example, many 3D image filters result in a 2D representation that could be used as a single feature layer. The problem here is that as the convolution kernel dimension gets large enough to usefully capture the features of interest (say 20^3 voxels), the processing time becomes very long. It might be worth trying, but I'm not sure.
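
One possible reading of "2D feature layers from 3D information" is to filter a thin sub-stack around a slice and keep only the centre slice of the result. The sketch below uses `scipy.ndimage.uniform_filter` (a 3D box mean) purely as a stand-in filter; the function name `local_3d_mean_layer` and the kernel size of 5 are assumptions, not something the current script does.

```python
import numpy as np
from scipy import ndimage

def local_3d_mean_layer(stack, z, size=5):
    """Return a 2D layer: the 3D box-mean of `stack` evaluated at slice `z`.

    Hypothetical helper. Only the slices within the kernel's reach of `z`
    are filtered, so the whole stack is never processed at once.
    """
    half = size // 2
    z0 = max(0, z - half)
    z1 = min(stack.shape[0], z + half + 1)
    sub = stack[z0:z1].astype(np.float32)
    # mode="nearest" repeats edge voxels at the borders of the volume.
    filtered = ndimage.uniform_filter(sub, size=size, mode="nearest")
    return filtered[z - z0]

# Usage on a small synthetic stack ordered (z, y, x).
rng = np.random.default_rng(0)
demo_stack = rng.random((8, 16, 16)).astype(np.float32)
layer = local_3d_mean_layer(demo_stack, z=3, size=5)
```

The returned layer has the same shape as one slice, so it can be added to the feature set like any 2D filter output. Larger kernels need more neighbouring slices per layer, which is where the processing-time concern above comes in.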

gtrancourt commented 6 years ago

Since I have no knowledge of how the algorithm works at the moment, I asked this naive question knowing that it would be computationally expensive. However, I don't think the whole leaf would have to be labelled, just sets of 3 slices, so that the training can be done on the 6 right-angle neighbouring voxels instead of 4 (if this is how the algorithm works).
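
As an illustration of what "6 right-angle neighbours instead of 4" could look like as features, here is a minimal NumPy sketch. It assumes a stack ordered (z, y, x) and simply collects, for every pixel in one slice, its own intensity plus those of its four in-plane neighbours and the voxels directly before and after it in z. The function name is hypothetical, and this is not a description of how the existing algorithm builds its features.

```python
import numpy as np

def six_neighbour_features(stack, z):
    """Return an (n_pixels, 7) array: centre voxel plus its 6 face neighbours.

    Hypothetical helper. Edge voxels reuse their own value for missing
    neighbours (edge padding). Padding the full stack is wasteful for
    large volumes; it is done here only to keep the sketch short.
    """
    padded = np.pad(stack, 1, mode="edge")
    zc = z + 1  # index of slice z inside the padded stack
    layers = [
        padded[zc, 1:-1, 1:-1],      # centre
        padded[zc, :-2, 1:-1],       # neighbour one row up
        padded[zc, 2:, 1:-1],        # neighbour one row down
        padded[zc, 1:-1, :-2],       # neighbour one column left
        padded[zc, 1:-1, 2:],        # neighbour one column right
        padded[zc - 1, 1:-1, 1:-1],  # same position, previous slice
        padded[zc + 1, 1:-1, 1:-1],  # same position, next slice
    ]
    return np.stack([layer.ravel() for layer in layers], axis=1)

# Usage: the middle slice of a 3-slice set.
toy_stack = np.arange(27).reshape(3, 3, 3)
feats = six_neighbour_features(toy_stack, z=1)
```

With this layout, only the middle slice of each 3-slice set needs labels; the two outer slices contribute intensities only.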

But as you pointed out, Mason, there should be consistency in the position of the epidermis, which I didn't know about. So, when writing the 'manual' for this algorithm (i.e. a paper?), the pre-processing part will be important, including do's and don'ts. My case was probably a "let's think about it for a second" case. It would help if the algorithm were slightly less sensitive to angles and curvature in the epidermis, but then again, the bottom epidermis was well identified. So the issue is with the tape and the non-parallel top epidermis, if I understand correctly.

I do like the idea of post-processing as a way to correct for "mistakes", such as air-filled vessels in the veins, and to smooth out the veins. The 2D mesh seems like a good idea. Also, maybe post-processing could correct for inconsistencies such as veins outside the leaf, or tape inside the leaf. That could probably be done easily by assigning some rules like 'tape should not be separated by air' or the like. The post-processing could actually be an independent function.
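
A rule-based clean-up of this kind could be sketched as an independent function along the following lines. This is only one simple stand-in rule, not the 2D-mesh approach and not existing code: it keeps the largest connected region of a class (e.g. the tape) and reassigns any other, isolated patches of that class to a fallback label. The label values `TAPE` and `CELL` and the function name are hypothetical.

```python
import numpy as np
from scipy import ndimage

TAPE, CELL = 3, 1  # hypothetical label values, for illustration only

def keep_largest_component(labels, target, fallback):
    """Relabel all but the largest connected region of `target` as `fallback`.

    Works on 2D slices or 3D stacks; returns a new array and leaves the
    input untouched. Uses scipy's default face connectivity.
    """
    mask = labels == target
    components, n_components = ndimage.label(mask)
    if n_components <= 1:
        return labels.copy()
    sizes = np.bincount(components.ravel())
    sizes[0] = 0  # component 0 is everything that is not `target`
    largest = sizes.argmax()
    cleaned = labels.copy()
    cleaned[mask & (components != largest)] = fallback
    return cleaned

# Usage: a stray tape pixel inside the leaf is reassigned to cells.
predicted = np.array([
    [3, 3, 3, 3],
    [1, 1, 1, 1],
    [1, 3, 1, 1],
    [0, 0, 0, 0],
])
cleaned = keep_largest_component(predicted, TAPE, CELL)
```

Choosing the fallback label is the weak point of such a rule; a more careful version might assign the most common label among the neighbouring voxels instead.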

Anyway, my 2 cents for today!