rzldasb / learning_R.github.io

Regular Expression #6

Open rzldasb opened 5 years ago

rzldasb commented 5 years ago
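Function   Description
nchar      number of characters in a string
toupper    convert to upper case
tolower    convert to lower case
substr     extract a substring
grep       pattern matching with regular expressions
sub        replacement with regular expressions
strsplit   split a string
paste      concatenate strings
match      vector of the positions of the matching elements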
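
A minimal sketch of the functions in the table, using a made-up string `s` (the variable name and sample values are illustrative, not from the original notes):

```r
s <- "Hello, World"   # hypothetical sample string

nchar(s)                             # 12
toupper(s)                           # "HELLO, WORLD"
tolower(s)                           # "hello, world"
substr(s, 1, 5)                      # "Hello"
strsplit(s, ", ")[[1]]               # "Hello" "World"
paste("Hello", "World", sep = ", ")  # "Hello, World"
match("b", c("a", "b", "c"))         # 2
```

Note that strsplit returns a list with one element per input string, which is why the example takes `[[1]]`.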

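Introduction to regular expressions

  1. Common characters
Pattern  Meaning
.        any single character
^        start of the string
$        end of the string
[a-z]    any one character from a to z
{2}      the preceding item repeated exactly twice
\d       a digit, 0-9
\D       a non-digit
\w       a word character: a letter, digit, or underscore
\s       a whitespace character (space, tab, newline)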
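
The patterns above can be tried out with grepl. This is only a sketch on a made-up vector `x`; note that inside an R string literal each backslash has to be doubled, so `\d` is written `"\\d"`.

```r
x <- c("apple", "Banana", "a1", "room 42", "aa")   # hypothetical test strings

grepl("^a", x)        # starts with "a":          TRUE FALSE  TRUE FALSE  TRUE
grepl("[0-9]$", x)    # ends with a digit:       FALSE FALSE  TRUE  TRUE FALSE
grepl("\\d", x)       # contains a digit:        FALSE FALSE  TRUE  TRUE FALSE
grepl("\\s", x)       # contains whitespace:     FALSE FALSE FALSE  TRUE FALSE
grepl("^a{2}$", x)    # exactly "aa":            FALSE FALSE FALSE FALSE  TRUE
grepl("^[a-z]+$", x)  # lowercase letters only:   TRUE FALSE FALSE FALSE  TRUE
```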

Metacharacters

[ ] \ ^ $ . | ? * + ( )
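
To match a metacharacter literally it has to be escaped. A short sketch, with a made-up vector `nms`, of three common ways to do that in R: a doubled backslash, a one-character class such as `[.]`, or `fixed = TRUE`, which turns off regular-expression interpretation altogether.

```r
nms <- c("a.exe", "axexe", "notes.txt")   # hypothetical names

grepl(".exe$", nms)                # TRUE  TRUE FALSE: unescaped "." also matches "x"
grepl("\\.exe$", nms)              # TRUE FALSE FALSE: "\\." matches a literal dot
grepl(".exe", nms, fixed = TRUE)   # TRUE FALSE FALSE: pattern taken literally
sub("[.]", "_", "a.b")             # "a_b"
```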

rzldasb commented 5 years ago

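Searching strings

grep: returns an integer vector with the indices of the matching elements. grepl: returns a logical vector with one TRUE/FALSE per element.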

> files <- list.files("c:/windows") 
> grep("\\.exe$", files)    # indices of the file names that end in .exe
 [1]   8  28  30  35  36  57  68  98  99 101 110 111 114 116 
> grepl("\\.exe$", files)   # for each file name, whether it ends in .exe
  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE 
 [14] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE 

In practice, both give the same result when used to subset the vector:

> files[grep("\\.exe$", files)] 
 [1] "bfsvc.exe"      "explorer.exe"   "fveupdate.exe"  "HelpPane.exe"   
 [5] "hh.exe"         "notepad.exe"    "regedit.exe"    "twunk_16.exe"   
 [9] "twunk_32.exe"   "uninst.exe"     "winhelp.exe"    "winhlp32.exe"   
[13] "write.exe"      "xinstaller.exe" 
> files[grepl("\\.exe$", files)] 
 [1] "bfsvc.exe"      "explorer.exe"   "fveupdate.exe"  "HelpPane.exe"   
 [5] "hh.exe"         "notepad.exe"    "regedit.exe"    "twunk_16.exe"   
 [9] "twunk_32.exe"   "uninst.exe"     "winhelp.exe"    "winhlp32.exe"   
[13] "write.exe"      "xinstaller.exe" 
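
A platform-independent sketch of the same comparison, using a made-up vector `demo_files` instead of the contents of c:/windows. It also shows `value = TRUE` and `ignore.case = TRUE`, which are standard grep arguments, and the one case where the two functions are not interchangeable: excluding matches when nothing matches.

```r
demo_files <- c("notepad.exe", "win.ini", "REGEDIT.EXE", "system.ini")  # hypothetical

grep("\\.exe$", demo_files)                  # 1
grepl("\\.exe$", demo_files)                 # TRUE FALSE FALSE FALSE
grep("\\.exe$", demo_files, value = TRUE)    # "notepad.exe"
grep("\\.exe$", demo_files, value = TRUE,
     ignore.case = TRUE)                     # "notepad.exe" "REGEDIT.EXE"

# Excluding matches: no name ends in .dll, so grep returns integer(0)
demo_files[-grep("\\.dll$", demo_files)]     # character(0): everything is dropped
demo_files[!grepl("\\.dll$", demo_files)]    # all four names are kept
```

For this reason the grepl form is usually the safer choice when negating a match.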
rzldasb commented 5 years ago

Text replacement

> text <- "Hello Adam!\nHello Ava!" 
> text 
[1] "Hello Adam!\nHello Ava!" 
> sub(pattern="Adam", replacement="world", text)  # sub returns a modified copy
[1] "Hello world!\nHello Ava!" 
> text  # the original string is unchanged
[1] "Hello Adam!\nHello Ava!" 
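
sub replaces only the first match in each string; gsub replaces every match. A small sketch on the same string (the replacement values are illustrative), including a backreference `\\1` to the first parenthesised group:

```r
text <- "Hello Adam!\nHello Ava!"

sub("Hello", "Bye", text)         # "Bye Adam!\nHello Ava!"  (first match only)
gsub("Hello", "Bye", text)        # "Bye Adam!\nBye Ava!"    (all matches)
gsub("A(\\w+)", "<A\\1>", text)   # "Hello <Adam>!\nHello <Ava>!"
```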
rzldasb commented 5 years ago

Extracting substrings

substr and substring

> x <- "123456789" 
> substr(x, c(2,4), c(4,5,8)) 
[1] "234" 
> substring(x, c(2,4), c(4,5,8)) 
[1] "234"     "45"      "2345678"
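
The difference in the output above comes from recycling: substr returns one result per element of `x`, so for a single string only the first start and stop values (2 and 4) are used, while substring recycles all arguments to the longest length, giving the pairs (2,4), (4,5) and (2,8). A few more calls as a sketch (the variable `y` is illustrative):

```r
x <- "123456789"

substring(x, 1:3, 3:5)                # "123" "234" "345"
substring(x, 5)                       # "56789": the end position defaults to the end of the string
substr(c("abcdef", "uvwxyz"), 2, 4)   # "bcd" "vwx": vectorised over x

# substr can also be used on the left-hand side to replace in place
y <- "abcdef"
substr(y, 1, 2) <- "ZZ"
y                                     # "ZZcdef"
```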