r-lidar / lidR

Airborne LiDAR data manipulation and visualisation for forestry application
https://CRAN.R-project.org/package=lidR
GNU General Public License v3.0

Tree segmentation improvements #326

Closed: spono closed this issue 4 years ago

spono commented 4 years ago

Sometimes the proposed segmentation algorithms segment objects that have no "ecological" coherence, meaning that pretty short trees can end up with overly large crown areas. A (basic?) idea might be to use an allometric equation when R is not defined by the user; that way, the search radius would be flexible enough to allow more correct segmentation. I guess this paper from Pretzsch et al. might help.

Much more complicated would be constraining the segmentation using shape fitting, such as the enhancement recently proposed by Amiri et al. or the corresponding treetop identification method from Polewski et al.

Jean-Romain commented 4 years ago

What is the actual request? To modify the algorithms? To add new functions? To provide post-processing tools? Please explain exactly what you are expecting.

spono commented 4 years ago

I think they all fall within the improvement of existing tools, similar to the hmin you imposed on the algorithms to avoid over-segmentation of short trees. In the same way (at least for the option concerning the search radius), I guess R could be limited automatically after evaluating the approximate height of the tree top.

Jean-Romain commented 4 years ago

In summary, your request is more or less: "hey, here are some interesting papers, could you read them and consider implementing what they describe?"

Sadly, without more collaboration on your part, the answer is probably no (with high probability). I no longer have the time to spend weeks implementing methods from scratch out of the literature. I can keep your references for later if one day I find some time to consider these studies, but there are too many other papers to consider and only a few hours per day.

If you really want me to implement tools from the literature, I'm still keen to spend time on it, but you must contribute on your side as well. For example by arguing why it is interesting and explaining how it works, to help me go through the papers faster, or anything else that is a valuable contribution.

spono commented 4 years ago

Sorry, I wrote too quickly, but I'm definitely not willing to leave all the work to you: it was more "thinking out loud" to start a discussion. My fault.

jgrn307 commented 4 years ago

I can see some interesting ways of implementing this. One way would be to build the allometry into a parameter-optimization wrapper around the ITC algorithms -- basically set a height-to-crown-area ratio range based on allometric relationships (both quantities drop out of the ITC algorithms), and then run the algorithm with varying parameters until you get the maximum number of trees with "acceptable" height-to-crown-area relationships. This would be largely agnostic to the ITC approach.

With that said, height-to-crown-area allometries are not all that common, and they vary widely across different lifeforms and species.
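The wrapper idea above can be sketched in a few lines of base R. Everything here is hypothetical: `tune_itc`, `acceptable`, and the envelope coefficients are invented for illustration, and `segment_fun` stands in for any real ITC routine that returns per-tree heights and crown areas.

```r
# Sketch of the parameter-optimisation wrapper idea (all names hypothetical).
# `segment_fun` is any ITC routine returning a data.frame with columns
# `height` (m) and `crown_area` (m2); each candidate parameter value is
# scored by how many segmented trees fall inside an allometrically
# plausible height-to-crown-area envelope.

acceptable <- function(height, crown_area,
                       a = 0.5, b = 1.6) {   # made-up envelope coefficients
  max_area <- a * height^b                   # allometric upper bound on crown area
  crown_area > 0 & crown_area <= max_area
}

tune_itc <- function(segment_fun, param_grid) {
  scores <- sapply(param_grid, function(p) {
    trees <- segment_fun(p)
    sum(acceptable(trees$height, trees$crown_area))
  })
  param_grid[which.max(scores)]              # parameter yielding most plausible trees
}

# Toy demonstration with a mock segmenter: larger `p` inflates the crowns.
mock_segment <- function(p) {
  data.frame(height     = c(5, 15, 30),
             crown_area = c(5, 20, 60) * p)
}
tune_itc(mock_segment, param_grid = c(0.5, 1, 2, 4))
```

In practice `segment_fun` would wrap a call to a real segmentation algorithm and derive crown areas from the resulting polygons, but the scoring loop stays the same regardless of the ITC approach, which is the point jgrn307 makes above.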

Jean-Romain commented 4 years ago

About the flexible radius, what are we talking about? Individual tree detection (ITD) or individual tree segmentation (ITS)? If ITD, isn't that already what lmf() provides?
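For reference, lmf() does accept a function of height as its window size, which is exactly a height-dependent search radius on the detection side. A minimal sketch (the coefficients below are invented; a real window function would be calibrated against local allometry):

```r
# Variable-radius local maximum filtering: the lmf() window size `ws`
# can be a function of the point height, so the search window grows
# with tree height. Coefficients here are made up for illustration.
ws_allometric <- function(z) {
  w <- z * 0.07 + 2   # window diameter (m) as a linear function of height
  w[w < 2] <- 2       # clamp to a sensible minimum...
  w[w > 6] <- 6       # ...and maximum
  w
}

# With lidR this would be used as:
#   ttops <- locate_trees(las, lmf(ws_allometric, hmin = 2))

ws_allometric(c(5, 20, 40))
```

The open question in this thread is whether the same height-dependence can be applied on the segmentation side, where the maximum crown radius is currently a single fixed value.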

spono commented 4 years ago

With that said, the height to crown area allometries are not all that common, and vary widely across different lifeforms and species.

True, and North America is advantaged on this front (e.g.). But on the other hand, such allometries could be created locally after field surveys.

Individual tree detection (ITD) or individual tree segmentation (ITS)? If ITD it is already what lmf() provides isn't it?

I guess using lmf() should work for detection, but for the search radius during segmentation, maybe expose the parameters so the user can tweak the values in the function proposed by Popescu and Wynne. I'm thinking of something like calling li2012(..., speed_up = lidR::lmf()) and explaining that, if used, it has to follow the aforementioned paper?

SIDE NOTE: the release of v3.0.0 could also be the occasion to rename speed_up (li2012) to max_cr (as in dalponte2016 and silva2016) and make parameters consistent among algorithms.

Jean-Romain commented 4 years ago

I'm sorry, but I hardly understand what you are trying to explain. What is the search radius during segmentation? Which algorithm are you referring to? I don't understand what we are talking about. And what would speed_up = lidR::lmf(), if created, mean?

In lidR, algorithms are (as far as possible) strict implementations of published papers. The first implication is that I won't make any modification to the current algorithms. I can only add new ones backed by a peer-reviewed reference. You mentioned Popescu and Wynne, but lmf() is already a versatile implementation of Popescu and Wynne. What more do you expect?

Release of v. 3.0.0 could be the occasion also for renaming speed_up (li2012) to max_cr (as in dalponte2016 and silva2016) and make params consistent among algs

No. That is not the meaning of the parameter. The original paper has no maximum crown size and can create trees of unbounded extent. speed_up was introduced to speed up the algorithm, which is not computable otherwise; its side effect is to impose a maximum crown radius. This is weak anyway, because the algorithm is quadratically complex, making it quickly uncomputable.

bi0m3trics commented 4 years ago

I'll just throw this out there as a possible clarifying point, and @Jean-Romain can correct me if I'm wrong... but it seems like the items proposed here would be perfect as alternative functions/contributions in the lidRplugins package. It would just be a matter of forking it, starting an implementation of some working code, and then seeing where it goes from there...

Then, after successful implementation and hopefully application, these modifications could be incorporated into later versions of lidR.

Jean-Romain commented 4 years ago

In lidRplugins I can put as many non-peer-reviewed algorithms as I want. However, I won't spend time developing experimental algorithms just for fun. And I think that is not the question here. It seems to me that we are talking about what to do, not about how to do it.

spono commented 4 years ago

Regarding the paraboloid fitting: I have no knowledge of the practical implementation of the Li 2012 algorithm, but I thought the papers from Amiri and Polewski were suitable for integration as further refinements of an existing algorithm, since they are proposed for top-down segmentation algorithms applied directly to the point cloud. If I misinterpreted them and/or they cannot be integrated (easily or at all), the discussion is obviously closed. Not having appropriate programming skills, I tried to open a discussion on the topic with someone who can evaluate whether it is feasible.

It's the same for the suggestion of adding a variable horizontal limit to the segmentation of small trees: yes, lmf() allows the detection of trees using a variable ws proportional to tree height. But when it comes to the segmentation side, the speed_up or max_cr parameter is taken into account: this means that for a 30-metre tree or a 5-metre one, the candidate points are those within the same radius. This can (for different reasons) produce small trees with overly wide crowns. An example comes from regeneration areas: if many small trees are detected, the segmentation will be well constrained by the close spacing of the individuals; but in shrubby areas within which a taller individual is growing, shrubs will be taken as part of the canopy up to the speed_up or max_cr distance. Am I wrong? Here is an example from the application of dalponte (the highlighted crown is approx. 360 sq metres for a tree 5.2 metres tall): [screenshot: overseg]
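The 360 m² crown on a 5.2 m tree could at least be flagged post hoc with a height-based plausibility check. A minimal sketch in base R; the limit function and its coefficients are invented for illustration, and a real one would come from local field data (e.g. Pretzsch-style crown-width equations):

```r
# Flag crowns whose area is implausible for their height.
# The allometric limit below is hypothetical, purely for illustration.
max_crown_area <- function(height, k = 2.5) {
  r <- k * log1p(height)   # hypothetical max crown radius (m)
  pi * r^2                 # corresponding max crown area (m2)
}

# Two crowns: the 5.2 m / 360 m2 case from the screenshot, and a
# plausible 28 m tree with a 95 m2 crown.
crowns <- data.frame(height = c(5.2, 28), area = c(360, 95))
crowns$oversegmented <- crowns$area > max_crown_area(crowns$height)
crowns
```

With lidR, the `height` and `area` columns would come from the crown polygons of the segmentation output; only the first crown is flagged here, which is the kind of screening this suggestion is after.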

BTW, if the package's goal is to have strict (as far as possible) implementations of published papers independently of possible improvements, then this discussion is closed too.

Jean-Romain commented 4 years ago

Ok, so let me do a summary. Correct me if I misunderstood.

Crown size stopping criterion

You are suggesting to add a criterion in each algorithm to constrain the crown sizes with tree heights.

In theory I could do that, because it is not such a big modification of the original algorithms. For example, the user API could look like the following, with fallometry a function that takes the height of the tree as input.

algo <- algorithm(chm, ttops, max_cr = fallometry)

In silva2016 we already have max_cr_factor that does the job, but I guess we can tweak the algorithm a bit to allow something like max_cr_factor = f(Z). In li2012 it is technically possible, I guess; the speed_up argument already acts like a max_cr parameter. In dalponte2016 it is technically doable but harder, because I think the algorithm must be redesigned much more. In the watershed it is possible by post-processing the output.

But it is a lot of work, because each algorithm needs specific and dedicated new code to achieve this task.
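A minimal sketch of what the proposed fallometry callback could behave like, following the max_cr = f(Z) idea above. This is not lidR code: fallometry, its coefficients, and the attachable helper are all hypothetical, standing in for the test a region-growing loop would apply.

```r
# Hypothetical height-dependent crown radius limit: during growing, a
# point at distance d from a seed of height z would only be attached to
# that crown if d <= fallometry(z). Coefficients are invented.
fallometry <- function(z) 0.15 * z + 1   # max crown radius (m) vs height (m)

# The attachment test a growing loop would apply at each candidate point:
attachable <- function(d, seed_height) d <= fallometry(seed_height)

attachable(d = 6, seed_height = 5)    # 5 m tree:  limit is 1.75 m, so rejected
attachable(d = 6, seed_height = 40)   # 40 m tree: limit is 7 m, so accepted
```

This is exactly the behaviour that would prevent the shrub example above: points 6 m away from a 5 m seed would no longer be absorbed into its crown, while a 40 m tree keeps its wide search radius.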

Paraboloid post-processing

You are suggesting to consider the work on paraboloid fitting to refine the output of any tree segmentation algorithm.

From my (very) quick look at the paper and the figures, it seems to be a refinement of the tree tops, but I did not actually read the paper. Anyway, I looked enough to tell you that it is definitely not a work I can reproduce.

Other comments

but in the case of shrubby areas within which is growing a taller individual, shrubs will be taken as part of the canopy till speed_up or max_cr distance. Am I wrong?

Honestly, I don't know. In li2012 and watershed it is hard to say; li2012 has a parameter to control the minimum distance between two trees. In dalponte2016 and silva2016 what matters is the seeds, i.e. the tree tops. If you have good tree tops, you have the trees.

BTW, if the package's goal is to have strict (as far as possible) implementations of published papers independently of possible improvements, then this discussion is closed too.

I can make a few modifications if they are straightforward to understand. But I want the methods to be very well documented, and that is not possible in package documentation alone; only a published paper allows a real understanding of a method. Moreover, I don't want my name associated with wild, never-tested algorithms.

spono commented 4 years ago

It's a pity they're both out of reach... but it was worth a try! At least now I know that, if needed, they have to be tackled through dedicated development. As usual, thanks a lot for taking the time to evaluate such suggestions!