a0981906660 / Fortune500_SDG_Analysis

This repo is an example for Helen

01_keyword code explanation #3

Open Helen-CC opened 1 year ago

Helen-CC commented 1 year ago

Lines 18-21:
df_keyword %>%
  select(word2, sdg) %>%
  drop_na() %>%
  rename(word = word2)

Line 24:
mutate(SDG_order = str_extract(sdg, pattern = "\\d+"),
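For readers following the Python `re` reference linked at the end of this thread, the `str_extract(sdg, pattern = "\\d+")` call on line 24 can be read as "take the first run of digits in the `sdg` label". A minimal sketch of the same idea, assuming labels look like "SDG 12"; the sample labels and the helper name `extract_sdg_order` are made up for illustration and are not taken from the keyword file:

```python
import re

def extract_sdg_order(sdg):
    """Return the first run of digits in `sdg`, or None if there is none.

    Mirrors stringr::str_extract, which returns NA when nothing matches.
    The result is a string, not a number, just like in the R code.
    """
    match = re.search(r"\d+", sdg)
    return match.group(0) if match else None

# Hypothetical labels; the real values live in the `sdg` column.
for label in ["SDG 1", "SDG 12", "no number here"]:
    print(label, "->", extract_sdg_order(label))
```

Because the extracted value is text, sorting by it would put "12" before "2" unless it is converted to an integer first.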

Lines 29-30 (please confirm):
# trim white spaces at both sides
mutate(word = str_trim(word, side = "both"))

Lines 31-34:
# add regular expressions
mutate(word = str_replace_all(word, "\\*", ".?")) %>%
mutate(word = str_replace_all(word, " AND ", ".*?")) %>% # e.g. "Economic Resource AND Access": both terms appear in the same sentence, not necessarily adjacent
mutate(word = str_split(word, "; ")) %>% # split Excel cells that contain a semicolon (;) into separate entries, e.g. row 101
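To see what lines 29-34 do to a single keyword cell, the steps can be replayed in Python with the `re` module. This is only a sketch under stated assumptions: the input cell and the test sentences are invented, and `keyword_to_patterns` is a hypothetical helper, not a function from the repo.

```python
import re

def keyword_to_patterns(cell):
    """Turn one keyword cell into a list of regex pattern strings."""
    cell = cell.strip()                   # str_trim(word, side = "both")
    cell = cell.replace("*", ".?")        # str_replace_all(word, "\\*", ".?")
    cell = cell.replace(" AND ", ".*?")   # str_replace_all(word, " AND ", ".*?")
    return cell.split("; ")               # str_split(word, "; ")

# Hypothetical cell holding two keywords separated by "; ".
patterns = keyword_to_patterns(" economic resource* AND access; poverty ")
print(patterns)

# The first term followed later by the second term: matches.
hit = re.search(patterns[0], "equal rights to economic resources and access to services")
# The same terms in the opposite order: no match.
miss = re.search(patterns[0], "access to economic resources")
print(bool(hit), bool(miss))
```

Two things are worth checking against the real data. The `.*?` that replaces " AND " allows any text between the two terms, so they need not be adjacent, but as written the first term still has to come before the second. And `.?` matches at most one extra character, so a keyword like `resource*` covers "resources" but not longer endings.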

Lines 39 and 44

Create a dataframe of keywords without spaces -> nspace

Create a dataframe of keywords with spaces

Environment after running line 50

Lines 66-71

Load the manual edited keyword mapping

https://docs.google.com/spreadsheets/d/1fZdE9WcFYI_d_sD4BgBpI5D1QhOYlRngsuSEtB7w694/edit#gid=470532546

df_manual <- read_excel("./data/raw_data/manual_edit_keywords.xlsx", sheet = "df_manual")
h <- hash(keys = df_manual$word, values = df_manual$word_new)
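The `hash(keys = ..., values = ...)` call builds a key-to-value lookup from the `word` column to the `word_new` column. A rough Python equivalent is a dictionary. The rows below are placeholders rather than the contents of the spreadsheet, and the fallback in `relabel` is an assumption, since the thread does not show how the script uses `h` afterwards:

```python
# Hypothetical rows standing in for the df_manual sheet (columns: word, word_new).
manual_rows = [
    ("povert.?", "poverty"),
    ("economic resource.?.*?access", "economic resources and access"),
]

# Equivalent of hash(keys = df_manual$word, values = df_manual$word_new).
h = {word: word_new for word, word_new in manual_rows}

def relabel(word):
    """Return the manually edited label, or the word itself if unmapped.

    Falling back to the original word is an assumption made for this sketch.
    """
    return h.get(word, word)

print(relabel("povert.?"))   # mapped key
print(relabel("hunger"))     # unmapped key, returned unchanged
```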

Lines 71-79

boyiechen commented 1 year ago

Markdown example

This is the Markdown 101 lecture.

Examples

# is the first level header

this is a 3rd level header

boyiechen commented 1 year ago

Another example: link directly to the lines you want explained

https://github.com/a0981906660/Fortune500_SDG_Analysis/blob/main/code/01_1_keyword.R#L5-L20

boyiechen commented 1 year ago

Markdown example

italic Bold face bold face option 2

*italic*
**Bold face**
__bold face option 2__

boyiechen commented 1 year ago

Reference

https://docs.python.org/3/library/re.html