OuhscBbmc / StatisticalComputing

OUHSC's SCUG (Statistical Computing Users Group)
MIT License

Regular expression problem #8

Closed Maleeha closed 4 years ago

Maleeha commented 4 years ago

I have the following regular expression to capture the dollar sign ($), the comma (,), the period (.), and the digits after the period.

^(\$)\d*(,)\d*(.)(\d*)

It is for the following character variable. I am giving one value here:

$34,000.00

I want to get rid of all the captured groups. What do I substitute them with in the following statement?

dplyr::mutate(
    income = as.numeric(sub(regex, "", annual_income))
  )

@wibeasley ... I am not able to tag anyone else.

Thanks!

wibeasley commented 4 years ago

How's this?

> source <- "$34,000.00"

> gsub("[$,\\.]", "", source)
[1] "3400000"

Maleeha commented 4 years ago

What if I want to match the two 0's after the period as well? @wibeasley

Maleeha commented 4 years ago

Okay this worked:

> source <- "$34,000.00"
> gsub("[$]|[,]|[\\.][0-9]{2}", "", source)
[1] "34000"
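
For anyone landing here later, a minimal sketch of two variations on the approaches above, in base R. Since the original statement wraps the result in as.numeric(), it may be enough to strip only the dollar sign and the comma and keep the decimal point, so the numeric value is not shifted by a factor of 100. The second option uses capture groups with backreferences, which is one way to read the original question. The second sample value and the variable names income and whole are made up for illustration; only "$34,000.00" comes from the thread.

```r
# Hypothetical sample values; only "$34,000.00" appears in the thread.
annual_income <- c("$34,000.00", "$900.00")

# Option 1: remove only "$" and ",", keeping the decimal point,
# so as.numeric() sees "34000.00" rather than "3400000".
income <- as.numeric(gsub("[$,]", "", annual_income))

# Option 2: capture the digit runs and keep them with backreferences.
# Like the regex in the question, this pattern assumes exactly one comma,
# so it is applied only to the first value here.
whole <- sub("^\\$([0-9]*),([0-9]*)\\.[0-9]*$", "\\1\\2", annual_income[1])
```

The expression from option 1 should drop into the dplyr::mutate() call from the question in place of the sub() call, assuming annual_income is a character column.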