Closed cjcarlson closed 6 years ago
Thanks for spotting this. I've edited the findParasite function to catch bracket symbols and hopefully parse the name correctly, but it's a bit hacky currently. It always assumes that the bracketed entry is in the middle, so it does something similar to your approach: it pastes together the first and second elements of the split if no brackets exist, or the first and third elements if brackets exist. I may fiddle around with this more.
Have you encountered any situations in the data where the brackets are elsewhere in the parasite name?
Have you encountered brackets in host names?
Are there instances where the parasite name is structured like "Text [text]" with no species name?
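In case brackets do turn up in other positions, here is a rough sketch of a position-independent alternative. It is not what findParasite currently does: the name shortName is hypothetical, and it assumes that anything inside [...] or (...) can be dropped before the first two remaining words are taken. A name like "Text [text]" would then come back as just "Text".

```r
# Hypothetical helper, not part of the package: strip every bracketed or
# parenthesised group, wherever it sits, then keep the first two words left.
shortName <- function(name) {
  # Remove "[...]" and "(...)" groups (assumes they are never nested)
  cleaned <- gsub("\\[[^]]*\\]|\\([^)]*\\)", "", name)
  # Split what remains on runs of whitespace
  parts <- strsplit(trimws(cleaned), "\\s+")[[1]]
  # Keep at most two words; a lone genus is returned as-is
  paste(head(parts, 2), collapse = " ")
}

shortName("Cladorchis [Fischr.] watsoni (Conyngham, 1904)")
shortName("Acanthogyrus (Acanthosentis) tilapiae (Baylis, 1948)")
shortName("Text [text]")
```

The trade-off is that this also discards a parenthesised authority at the end of the name, which seems to be what the short name wants anyway, but it would misbehave if a species epithet were ever written in brackets.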
Note: the build of the updated package may fail. This is because of 502 errors (the NHM database may be down for a moment).
I've decided to actually start letting you know about these the right way, so here goes!
Revisions included in Latin names lead to mis-parsed short names, for example:
"Cladorchis [Fischr.] watsoni (Conyngham, 1904)" parses as "Cladorchis [Fischr.]"
"Acanthogyrus (Acanthosentis) tilapiae (Baylis, 1948)" parses as "Acanthogyrus Acanthosentis"
I've been working on a function to address this:
revis <- function(name) {
  # Split the full name into space-separated tokens
  parts <- strsplit(name, ' ')[[1]]
  # If the second token opens with '(' or '[', skip it and use the third instead
  if (substr(parts[2], 1, 1) %in% c('(', '[')) {
    return(paste(parts[1], parts[3]))
  }
  paste(parts[1], parts[2])
}