ropensci / taxlist

Handling taxonomic lists
https://docs.ropensci.org/taxlist/

Expanding capabilities of dissect_name #6

Closed: kamapu closed this issue 4 years ago

kamapu commented 4 years ago

Through an example on Facebook I realized that dissect_name() is often needed to extract more than one element of a name and return them pasted together as a single character value, as in this example.

library(taxlist)

# Data frame
Data <- data.frame(GenusSpecies = c("Acer platanoides - bla bla", " Acer  platanoides Miguel ", "Acer platanifolius ble"), n = 1)

# Get rid of leading, trailing and double blanks
Data$GenusSpecies <- clean_strings(Data$GenusSpecies)

# This step could be done more efficiently
Data$new_name <- apply(dissect_name(Data$GenusSpecies)[,1:2], 1, paste, collapse=" ")

# Statistics
Stats <- aggregate(n ~ new_name, Data, sum)

To simplify the third command, a new argument, repaste, will be added to dissect_name().

kamapu commented 4 years ago

Enhancement implemented at https://github.com/ropensci/taxlist/commit/ae646e02be368a532535da198efd5c04a4daa054

kamapu commented 4 years ago

The task above can now be done with dissect_name() alone:

library(taxlist)

# Data frame
Data <- data.frame(GenusSpecies = c("Acer platanoides - bla bla", " Acer  platanoides Miguel ", "Acer platanifolius ble"), n = 1)

# Get rid of leading, trailing and double blanks
Data$GenusSpecies <- clean_strings(Data$GenusSpecies)

# New version: select the first two elements and paste them in one call
Data$new_name <- dissect_name(Data$GenusSpecies, repaste = 1:2)

# Statistics
Stats <- aggregate(n ~ new_name, Data, sum)
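
For readers without the package at hand, the effect described in this thread can be approximated in base R. The sketch below is only an illustration: repaste_parts() is a hypothetical helper written for this example, not part of taxlist, and it is not claimed to match how dissect_name() is implemented. It assumes that name elements are separated by blanks.

```r
# Hypothetical helper (not part of taxlist): keep the selected
# blank-separated elements of each name and paste them back together.
repaste_parts <- function(x, idx) {
  # Trim the ends and collapse repeated whitespace into single blanks
  x <- gsub("\\s+", " ", trimws(x))
  parts <- strsplit(x, " ", fixed = TRUE)
  vapply(parts, function(p) {
    sel <- p[idx]
    # Drop positions that do not exist in shorter names
    paste(sel[!is.na(sel)], collapse = " ")
  }, character(1))
}

Data <- data.frame(
  GenusSpecies = c("Acer platanoides - bla bla",
                   " Acer  platanoides Miguel ",
                   "Acer platanifolius ble"),
  n = 1
)

# Keep genus and species epithet only
Data$new_name <- repaste_parts(Data$GenusSpecies, 1:2)

# Count records per cleaned name
Stats <- aggregate(n ~ new_name, Data, sum)
# Stats holds two rows: "Acer platanifolius" with n = 1
# and "Acer platanoides" with n = 2
```

With the trailing text removed, the first two records collapse to the same name, which is why the aggregation counts them together.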