pbs-assess / rosettafish

:fishing_pole_and_fish: An R package for translating fish- and fisheries-related terms
2 stars 14 forks source link

Create capitalization vector argument #15

Closed cgrandin closed 4 years ago

cgrandin commented 5 years ago

"l" - lower for all letters in word "u" - upper for all letters in word "t" - upper for first letter in word

grinnellm commented 4 years ago

Just want to add that this would be nice as I just tried to add "year" to the translation file, but got an error re duplicate terms ("Year" and "year"). I made a work around by wrapping the call to en2fr in tolower but it would be nice to be able to specify the case. tolower(en2fr("Year", translate))

seananderson commented 4 years ago

This wouldn't be too hard, but I'm not sure what the interface should be. We want it to work with individual words or vectors. How about a case argument?

en2fr("Year", TRUE, case = "upper")
en2fr("Year", TRUE, case = "lower")
en2fr("Year", TRUE, case = "titlecase")

Looks like we could use tools::toTitleCase() for titlecase, which is built into R, which, amazingly, I just discovered right now.

I assume we want to leave it so that the default in the dictionary is the first letter capitalized.

I can't imagine a scenario where you would want different cases for different elements of a vector, so maybe it's fine to apply the same case to all of them.

grinnellm commented 4 years ago

That sounds good to me. And leaving the dictionary default as "sentence case" makes sense too. I guess in that odd scenario where one wants different cases for different elements of a vector, perhaps one could just paste together the translated output from multiple calls to en2fr with different case options.

grinnellm commented 4 years ago

And in case we want it, stringr has case options.

seananderson commented 4 years ago

Thanks, I used that. Options are case = c("none", "sentence", "lower", "upper", "title"). "none" uses the value in the built-in dictionary verbatim and is the default.

Unit test examples: https://github.com/pbs-assess/rosettafish/blob/7343004a9e87a0f3ad0cd80e715f1c53e102b4ad/tests/testthat/test-trans.R#L121-L130

Eventually we might want to add {}, as in BibTeX, to protect capitalization (and lowercase) in the dictionary.

grinnellm commented 4 years ago

Thanks for this fix Sean! I still have a related problem; I'm not sure if this should be addressed here or in a new issue. I have a graph label that I want lower case, and able to translate. This works as expected:

> en2fr( "Year", translate=TRUE, case="lower" )
[1] "année"

But this does not (upper case "Y" remains):

> en2fr( "Year", translate=FALSE, case="lower" )
[1] "Year"

And this doesn't work:

> en2fr( "year", translate=TRUE, case="lower" )
Error: The following term is not in the translation database: year

I can start a new issue if you would like, or we can re-open this one.

seananderson commented 4 years ago

Hmm, yeah, I didn't think about the instance where you want the case to apply to the non-translated term. Should be fixed now.

Also, the order of the arguments should let you keep it shorter if you want:

en2fr("Year", TRUE, "lower")
grinnellm commented 4 years ago

Great, thanks again! I know, but I like to be specific..:)