Open gzbib opened 4 years ago
You are working with logical statements to create groups.
condition.01 <- grepl( ... )
condition.02 <- grepl( ... )
Then you construct groups by defining inclusion criteria:
condition.01 & condition.02 # intersection
condition.01 | condition.02 # union
condition.01 & ! condition.02 # set difference
However, the problem is that sometimes a single title satisfies two conditions, for example, a question that begins with "How" and ends with "?". How can I separate these types?
Note that the criteria you are describing would need to be mutually exclusive in order to completely separate the groups.
Since one case can meet both, you need to operationalize your definitions (how are you identifying a question, for example) and then decide how to separate groups into sub-groups (regular question, question with a colon then an answer, etc.).
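One way to see how much the criteria overlap before deciding on sub-groups is to cross-tabulate the logical vectors. This is a minimal sketch, assuming a small made-up vector of titles; `titles`, `starts.how` and `has.question` are hypothetical names, not from the original data.

```r
# Hypothetical example titles; not from the original data set
titles <- c( "How to achieve goals.",
             "How can I achieve my goals?",
             "Goals: a short guide",
             "Why set goals?" )

starts.how   <- grepl( "^How", titles )   # begins with "How"
has.question <- grepl( "\\?$", titles )   # ends with "?"

# Cross-tabulation: each cell counts titles with that combination
overlap <- table( starts.how, has.question )
overlap

# Titles that meet both criteria and therefore need an explicit rule
both <- titles[ starts.how & has.question ]
both
```

A non-zero count in the TRUE/TRUE cell means the two criteria are not mutually exclusive, so a rule is needed for which group those titles belong to.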
Thank you, Sir @lecy, but do we have to separate the types in this way? Or do we just make sure it meets one condition?
I am trying to negate a character (?) in a grepl expression but am not finding the right syntax.
It's very hard to write compound regular expressions. You are better off writing clear regular expressions, then combining criteria using logical vectors:
x <- c( "How to achieve goals.", "How can I achieve my goals?" )
c1 <- grepl( "^How", x )
c2 <- grepl( "\\?", x )
c1
[1] TRUE TRUE
c2
[1] FALSE TRUE
c1 & ! c2
[1] TRUE FALSE
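For comparison, a similar filter can be squeezed into a single pattern with a negated character class, `[^?]`, which matches any one character except a question mark. This is only a sketch. It tests whether the title ends with something other than "?", whereas `c2` above looks for a question mark anywhere, so the two approaches can disagree on titles with a "?" in the middle.

```r
x <- c( "How to achieve goals.", "How can I achieve my goals?" )

# Starts with "How" and the last character is anything except "?"
single <- grepl( "^How.*[^?]$", x )

# Same filter built from two simple patterns
combined <- grepl( "^How", x ) & ! grepl( "\\?$", x )

# Compare the two results on this example
identical( single, combined )
```

The combined version is longer, but each pattern can be checked on its own, which is the point of the advice above.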
Hello Sir,
# x is the vector of clean titles
c1 <- grepl( "^How", x )
c2 <- grepl( "\\?", x )
I think I get the logic, but maybe I am not translating it into the right code? For example, what I was trying to do is below:
if (c1 & !c2){
print how-titles
}
else{
print how-titles with no questions
}
The results I got, especially for the second part of the if-else statement, are not right. I kept getting "How" questions with question marks.
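`if()` expects a single TRUE or FALSE, so handing it a whole logical vector does not evaluate each title in turn (older versions of R use only the first element and warn; R 4.2.0 and later raise an error). A sketch of the alternative, subsetting with the logical vectors directly, on made-up titles:

```r
# Hypothetical example titles
x <- c( "How to achieve goals.",
        "How can I achieve my goals?",
        "Setting goals" )

c1 <- grepl( "^How", x )   # starts with "How"
c2 <- grepl( "\\?", x )    # contains a question mark

# Logical subsetting keeps the elements where the condition is TRUE
how.statements <- x[ c1 & ! c2 ]   # "How" titles without "?"
how.questions  <- x[ c1 & c2 ]     # "How" titles with "?"

how.statements
how.questions
```

Note that the complement of `c1 & ! c2` is every other title, not only the "How" questions, which is why each group gets its own condition here rather than relying on an `else` branch.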
You can use the logical vector directly as your group vector for some basic analysis:
d <-
structure(list(f = c("treat", "control", "treat", "control",
"treat", "control", "treat", "control", "treat", "control"),
y = c(15, 8, 21, 9, 17, 9, 13, 11, 12, 8)), class = "data.frame", row.names = c(NA,
-10L))
         f  y
1    treat 15
2  control  8
3    treat 21
4  control  9
5    treat 17
6  control  9
7    treat 13
8  control 11
9    treat 12
10 control  8
mean( d$y[ d$f == "treat" ] )
[1] 15.6
mean( d$y[ d$f == "control" ] )
[1] 9
# DPLYR VERSION
library( dplyr )
d %>%
group_by( f ) %>%
summarize( ave=mean(y) )
  f         ave
  <chr>   <dbl>
1 control   9
2 treat    15.6
You can also construct a distinct group from the logical vector:
group <- ifelse( logical.vector, "how to title", "regular title" )
group <- ifelse( grepl( ... ), "how to title", "regular title" )
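When there are more than two categories, the same idea extends by assigning labels in a fixed priority order, so a title that meets several criteria lands in exactly one group. This is a sketch under assumed labels and an assumed priority; both are illustrative choices, and the `titles` vector is made up.

```r
# Hypothetical example titles
titles <- c( "How to achieve goals.",
             "How can I achieve my goals?",
             "Goals: a short guide",
             "Why set goals?",
             "Setting goals" )

starts.how <- grepl( "^How", titles )   # begins with "How"
ends.q     <- grepl( "\\?$", titles )   # ends with "?"
has.colon  <- grepl( ":", titles )      # contains a colon

# Start with a default label, then overwrite from lowest to highest
# priority; later assignments win when a title meets several criteria
group <- rep( "other", length( titles ) )
group[ has.colon ]             <- "colon title"
group[ ends.q ]                <- "question"
group[ starts.how & ends.q ]   <- "how question"
group[ starts.how & ! ends.q ] <- "how to title"

table( group )
```

Because every title ends up with exactly one label, `group` can be used as a grouping variable in the same way as `f` in the example above.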
Hello Sir @lecy,
I am trying to categorize the titles: the ones that start with "How", the ones that end with "?", the ones that contain ":", and others. However, the problem is that sometimes a single title satisfies two conditions, for example, a question that begins with "How" and ends with "?".
How can I separate these types?
Thanks in advance.