r-lidar / lidR

Airborne LiDAR data manipulation and visualisation for forestry application
https://CRAN.R-project.org/package=lidR
GNU General Public License v3.0
571 stars 130 forks source link

Renaming functions in lidR v3.0.0 #321

Closed Jean-Romain closed 4 years ago

Jean-Romain commented 4 years ago

Context

With discussed with Martin Isenburg about function names in lidR and LAStools. This is a discussion I wanted to have with him for a while. At the very beginning of the development of lidR I started to name the functions that return a LAS object lassomething(). I had no intention to overlap with Martin's software and anyway I have never planned lidR to be used by so many users. As lidR grew up, I kept going with the name convention but now lidR is used worldwide and my function names actually overlap Martin's ones at a point that it creates confusion for users which is problematic both for Martin and I. And it's gonna be worst because Martin will release new stuff for LAStools. It is my bad and I'm starting a renaming work that is likely to take 1 to 2 years to end.

What will change

Here the plan: every lassomething() function will be renamed. I still need to choose a new convention. The approximate schedule is:

Package maintainers will be contacted by email in due course anyway.

How it will change

I need to find a new name convention. I still want a 3 letters prefix because it is convenient for code autocompletion and search. And I want this prefix to tell the users what is the output of the function (as it currently is). Here my thought:

If you have any opinion, idea on the question please let me know.

Some functions will however remained prefixed with las_ if they makes sense only within the scope of the LAS specification such as las_rescale(), las_addextrabyte() and so on.

Renaming plan

Convention

Some functions have their own orphan convention such as track_sensor, voxelize_points, smooth_elevation, decimate_points.

Functions renamed

Old name New name
lasadddata add_attribute
lasaddextrabytes add_lasatttribute
lasaddextrabytes_manual add_lasatttribute_manual
lascheck las_check
lasclip clip_roi
lasclipCircle clip_circle
lasclipPolygon clip_polygon
lasclipRectangle clip_rectangle
lasclipTransect clip_transect
lasdetectshape segment_shapes
lasfilter filter_poi
lasfilterdecimate decimate_points
lasfilterduplicates filter_duplicates
lasfilterfirst filter_first
lasfilterfirstlast filter_last
lasfilterfirstofmany filter_firstofmany
lasfilterground filter_ground
lasfilterlast filter_last
lasfilternth filter_nth
lasfiltersingle filter_single
lasfiltersurfacepoints filter_surfacepoints
lasflightline retrieve_flightlines
lasground classify_ground
lasmergespatial merge_spatial
lasnormalize normalize_elevation
laspulse retrieve_pulse
lasrangecorrection normalize_intensity
lasremoveextrabytes remove_lasattribute
lasreoffset las_reoffsets
lasrescale las_rescale
lasscanline retrieve_scanline
lassmooth smooth_points
lassnags segment_snags
lastransform spTransform
lastrees segment_trees
lasunnormalize unnormalize_elevation
lasunsmooth unsmooth_elevation
lasvoxelize voxelize_points
sensor_tracking track_sensor
tree_detection find_trees
tree_hull delineate_trees
ptompalski commented 4 years ago

Hi @Jean-Romain

This is important news and important change to the package - I talked to Martin some time ago and he voiced his concerns as well. So I think it is a necessary change indeed.

I understand your point of view to have a nice prefix that will substitute "las". But did you condisder not having prefix at all? Many functions like eg lasvoxelize() or lastrees() would become just voxelize() or trees() (or perfhaps detect_trees()), which in my opinion would be perfectly fine. I know that some functions e.g. lasfilter() if changed to filter() would have lots of conflict with other packages (raster, dplyr), but in such cases a prefix could be added or function name changed completely.

Just throwing some ideas to the discussion. The existing prefix (las) indicates that the function works with las objects. But isn't it the case for the majority of functions in lidR? If yes, if all (or most) of the functions work with point cloud data, then perhaps the prefix is not needed.

Piotr

Jean-Romain commented 4 years ago

But did you condisder not having prefix at all?

I did. Not my favorite because prefixing allows to put together (and in alphabetic order) all the functions that are similar. It is convenient for the documentation, convenient to find functions you do not know. Sometime I do not remember my own function. I type the prefix and I get all the relevant functions. I think that no prefix just create a lot of mess.

lasfilter() if changed to filter() would have lots of conflict with other packages (raster, dplyr)

Actually the problems started here. I didn't want to create filter so I created lasfilter and then everything started to get names las* because outputs were LAS objects. At the beginning it was 5 functions and 50 users... Now I have to move.

The existing prefix (las) indicates that the function works with las objects.

It indicates that it returns and object LAS (later extended to LAScatalog) and grid_* return rasters, and catalog_ return LAScatalog (and work only on LAScatalog)

DRAAlmeida commented 4 years ago

Thanks Jean, to let me know.

Best regards,

Blecigne commented 4 years ago

Hi Jean-Romain ,

Thanks for your updates, that's really kind to keep us informed.

Regarding the prefix, why not simply use ls for nonspecific functions ? So you would have als, tls and ls.

Jean-Romain commented 4 years ago

So you would have als, tls and ls_.

I will definitively consider that one :wink:. Moreover it allows me to reserve some uav_something() or dap_something(). Who knows. I like it.

tgoodbody commented 4 years ago

Hi JR,

Alot of suggestions related to prefix's... I think that if you start to add too many like als, tls, ls etc that things become complicated and confused for users. This is really a branding problem related to your package (that is why Martin contacted you in the first place).

If it were up to me i would keep the function names as simple as possible and relate them to your package so users know explicitly that they are coming from your package. If it were up to me I would get rid of the prefix entirely (I know you said it makes it easier to search for functions) but I don't think that's a particularly compelling argument if all the information on the functions are well documented and easily accessible (which they are).

I am discussing in the lab now with Piotr, and we believe that the combination of i.e. detect_trees() / classify_ground() are the most simple, direct, and effective naming conventions. As it stands now, your functions already have either nouns and verbs that would be necessary for this, reducing need for complete naming overhaul i.e lasfilterground. This makes the purpose of the function clear and unambiguous and also creates consistency in naming for the whole package.

If we want to be smart about the release of the future lidRbook project which we have been collaborating on we should time the changes in lidR with the release of the book to provide users with comprehensive understanding of the changes and what the functions have been changed to. But this is perhaps getting away from the issue at hand.

Would be willing to discuss and help you further if needed.

Jean-Romain commented 4 years ago

Your arguments are good. I updated the first post with possible names. I don't like them all and I think they are not all consistent but I guess we can improve that.

ptompalski commented 4 years ago

I like those! Yes - there is definitely room for improvement but the initial versions are already quite good. And consistent!

Jean-Romain commented 4 years ago

But it also means than other functions must be update for consistency with verb + noun convention such as:

Old name New name
sensor_tracking track_sensor
tree_detection detect_trees
tree_hull ???
ptompalski commented 4 years ago

maybe 'delineate_trees()`? or 'delineate_crowns()' ?

Jean-Romain commented 4 years ago

I've update the first post for something pretty satisfying

spono commented 4 years ago

I like those! Yes - there is definitely room for improvement but the initial versions are already quite good. And consistent!

yep, I do agree with Piotr: they look fine and pretty easy to get.

maybe delineate_trees()? or delineate_crowns()?

lastrees -> segment_trees (consistent with segment_snags) tree_hulls -> delineate_crowns lasvoxelise -> voxelise_points

HTH!

tgoodbody commented 4 years ago

Looks great, JR!

bi0m3trics commented 4 years ago

Somewhat late to the game here, but I agree and really like the use of "verbs" as the renaming convention. It's consistent with the overall direction of R as a language... A few thoughts -

Lastly, what about the catalog_ functions (or any others)? Would you be renaming them to have the consistent verb_noun format? For example, would catalog_apply become apply or apply_catalog or catalog_intersect become intersect or intersect_catalog? Again, would these be generic functions as to not conflict with existing base functions?

Jean-Romain commented 4 years ago

You are not late in the game. Nothing has been decided yet. I still dislike the verb_noun convention but it sounds to be objectively a good option and the preferred one.

clip will conflict with raster::clip among others

Another one I have to consider. Thanks. So we have transform, clip, filter that need a special attention.

I like detect_trees (consistent with detect_shapes)

No it is not consistent. detect_shapes updates a las object adding a new attributes while detect_trees returns a SpatialPointsDataFrame. I need to find something else. find_trees or classify_shapes. But classify_shapes is inconsistent with classify_ground.

detect_trees_location or something like this will leave room for future algorithm for things like dbh, height to live crown, etc.

Clever

Lastly, what about the catalog_ functions (or any others)?

I thought about that this morning in the bus.

Old name New name
catalog_apply catalog_apply
catalog_intersect intersect (inherited from raster)
catalog_retile retile_catalog
catalog_select select_tiles
Jean-Romain commented 4 years ago

I've edited the first post with something that starts to be consistent.

tgoodbody commented 4 years ago

I have the opinion that catalog can be interpreted as a verb given its uniqueness and importance within the package. Given that the catalog is fundamental to lidR it is a good branding move to maintain the word within the functions that utilize its functionality.

tgoodbody commented 4 years ago

I guess it could also be the noun, depending on your preference :)

jfbourdon commented 4 years ago

I also prefer the verb_noun combination as it's explicite and more in line with what we find in other packages.

No it is not consistent. detect_shapes updates a las object adding a new attributes while detect_trees returns a SpatialPointsDataFrame. I need to find something else. find_trees or classify_shapes. But classify_shapes is inconsistent with classify_ground

  • tree_detection -> extract_treetops as it's really the top that is found and it creates a new object
  • tree_hulls -> extract_crowns, extract_treehulls or delineate_crowns as suggested before and it creates a new object
  • detect_shape as-is because it only updates the las object with a new attribute
  • To me classify_ should be reserved to functions that edit the classification attribute like classify_ground

In lidRplugins, lake_detection could become extract_lake (or even extract_flats as it's what it does) and a new function could do the classify_water if more complex algorithm is build on top of the current lake_detection.

So extract_ could create a sp object, detect_ add a new attribute and classify_ edit the classification attribute.

I also think that the new names should minimize conflicts with sf, sp, raster and maybe dplyr as those packages are most likely to be used in combination with lidR in a workflow.

Jean-Romain commented 4 years ago

So extract_ could create a sp object, detect_ add a new attribute and classify_ edit the classification attribute.

It's more or less what I did with segment_, find_ and classify_ but I guess their is still room for small improvements.

I also think that the new names should minimize conflicts with sf, sp, raster and maybe dplyr.

And base and graphics and stats and all important package actually. But inheritance, when possible, is fine because it extend a function without being in conflict such as plot, print, extent, projection. Right now I only have a conflict with clip from graphic without possible inheritance. In need another name. Another reason that make me choose lasclip at the very beginning.

spono commented 4 years ago

So extract_ could create a sp object, detect_ add a new attribute and classify_ edit the classification attribute.

It's more or less what I did with segment_, find_ and classify_ but I guess their is still room for small improvements.

I also think that the new names should minimize conflicts with sf, sp, raster and maybe dplyr.

And base and graphics and stats and all important package actually. But inheritance, when possible, is fine because it extend a function without being in conflict such as plot, print, extent, projection. Right now I only have a conflict with clip from graphic without possible inheritance. In need another name. Another reason that make me choose lasclip at the very beginning.

maybe clip_points or clip_cloud? it allows for clip_xxx_circle, clip_xxx_etc

Jean-Romain commented 4 years ago

This is my final draft. Seems consistent to me.

Some functions have their own orphan convention such as track_sensor, voxelize_points, smooth_elevation, decimate_points.

jfbourdon commented 4 years ago

track_ could be the prefix of funtions that return spatial lines to give the centerline of something that is not dimensionless (roads, rivers, tracking of sensor, wires)

Jean-Romain commented 4 years ago

Yep but Spatial* are not 3D objects. I use SpatialPoints with an attribute Z as a workaround but it is not possible with SpatialLines. This class is thus not used for the sensor and the wires.

Jean-Romain commented 4 years ago

v3.0.0 available in the devel branch.

Thanks everybody for your help.

tgoodbody commented 4 years ago

Looks great, JR. Thanks for including us all in this. I think the changes will be really nice for users and has been an interesting learning experience for me.