DS4PS / cpp-527-spr-2021

http://ds4ps.org/cpp-527-spr-2021/

Lab 4 Part 3 #13

Open AprilPeck opened 3 years ago

AprilPeck commented 3 years ago

After identifying individual statements, the lab instructions indicate we should create a separate group for each phrase, as in:

criteria.01 <- grepl( "immigrant rights", dat$mission ) 
criteria.02 <- grepl( "immigration", dat$mission ) 
criteria.03 <- grepl( "refugee", dat$mission ) 
criteria.04 <- grepl( "humanitarian", dat$mission ) 
criteria.05 <- ! grepl( "humanities", dat$mission )  # exclude humanities

Is there a reason to not just create one group using OR?

immigrant.group <- grepl( "immigrant rights|immigration|refugee|humanitarian", dat$mission )

lecy commented 3 years ago

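The OR operator within a regular expression is not the same thing as the OR operator in a logical statement, but in this case it gives the same result. Alternation has the lowest precedence of the regex operators, so | separates whole alternatives, not the letters next to it.

grepl( "immigrant rights|immigration", dat$mission )

The OR here applies to everything on either side of the | operator, so it would identify a mission containing either of the following:

immigrant rights
# the full phrase on the left
immigration
# or the full word on the right

Your single pattern therefore flags the same rows as criteria.01 through criteria.04 combined with the logical |. Keeping the criteria separate mainly makes it easier to check what each phrase picks up, and an exclusion like criteria.05 is still simplest to apply as its own step.

lecy commented 3 years ago

You are building words with regular expressions, so if you want an operator to act on individual letters you have to limit its scope with parentheses or a character class.

One good example is searching for gray and grey (the two variants). You could write the search as:

grepl( "gr(a|e)y", strings )

Without the parentheses, "gra|ey" would match any string containing "gra" or "ey".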
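A quick way to see how | behaves inside a pattern is to run it on a few strings. The sketch below uses made-up vectors named mission and strings (hypothetical toy data, not the lab's dat$mission) and compares a single alternation pattern against separate criteria joined with the logical OR.

```r
# Toy mission statements, made up for illustration (not the lab data)
mission <- c( "we defend immigrant rights",
              "immigration legal services",
              "refugee resettlement",
              "promoting the humanities",
              "youth soccer league" )

# One pattern: | separates whole alternatives
one.pattern <- grepl( "immigrant rights|immigration|refugee|humanitarian", mission )

# Separate criteria combined with the logical OR
criteria.01 <- grepl( "immigrant rights", mission )
criteria.02 <- grepl( "immigration", mission )
criteria.03 <- grepl( "refugee", mission )
criteria.04 <- grepl( "humanitarian", mission )
separate <- criteria.01 | criteria.02 | criteria.03 | criteria.04

same.result <- identical( one.pattern, separate )   # TRUE

# Parentheses or a character class limit the OR to single letters
strings <- c( "gray", "grey", "grab", "they" )
grouped  <- grepl( "gr(a|e)y", strings )   # TRUE TRUE FALSE FALSE
bracket  <- grepl( "gr[ae]y", strings )    # TRUE TRUE FALSE FALSE
no.group <- grepl( "gra|ey", strings )     # TRUE TRUE TRUE TRUE
```

The last line shows why grouping matters: without parentheses the pattern accepts anything containing "gra" or "ey", which is why "grab" and "they" are flagged.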

AprilPeck commented 3 years ago

That makes sense. Thanks. Another question. We are looking for organizations that serve black communities. Does that include historical societies specific to slavery/black history? (Sorry if I'm overthinking this.) I have included them, but looking at the final product I'm questioning my decision.

lecy commented 3 years ago

> Does that include historical societies specific to slavery/black history?

Yes, that's fine.

I mostly chose the example because it forces you to find words and phrases that disambiguate Black vs. black, and African American or Pan-African vs. African (e.g., African lions preservation).
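As a rough illustration of that kind of disambiguation, the sketch below relies on grepl() being case sensitive by default and on searching for specific phrases instead of the bare word "African". The mission vector and the chosen phrases are made up for the example, not taken from the lab data or its solution.

```r
# Made-up mission statements (hypothetical, not the lab data)
mission <- c( "Serving the Black community of Atlanta",
              "black bear habitat restoration",
              "African American heritage museum",
              "Pan-African cultural exchange",
              "African lions preservation" )

# grepl() is case sensitive by default, so "Black" skips "black bear"
criteria.01 <- grepl( "Black", mission )

# Specific phrases avoid the wildlife organization
criteria.02 <- grepl( "African American|Pan-African", mission )

black.group <- criteria.01 | criteria.02   # TRUE FALSE TRUE TRUE FALSE

# For comparison, the bare word also flags the lions group
too.broad <- grepl( "African", mission )   # FALSE FALSE TRUE TRUE TRUE
```

Capitalization alone is not a complete fix, since a mission that begins a sentence with "Black bear" would still match, so the results are worth checking by eye.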