KateHyoung / UTDEventData

An R package to retrieve political event data from the UTD API server
GNU Lesser General Public License v3.0

Managing Duplicate Events #15

Open lsyaseen opened 5 years ago

lsyaseen commented 5 years ago

Hi There,

I am using returnDyad( ) to extract data for two actors, and I am wondering if there is any way to remove the duplicates after extraction.

Several entries are duplicated, differing only in their URLs or text_sources. I am attempting to do the cleaning myself, but before that I was wondering if there is a way to extract "distinct events"?

KateHyoung commented 5 years ago

Thank you for your interest in our package.

Unfortunately, UTDEventData does not have a function for extracting "distinct events". You can, however, remove duplicated rows with the distinct() function in dplyr. For example, you can obtain a new data set with newdata %>% distinct(text_sources, .keep_all = TRUE), or pass URL in place of text_sources.
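As a rough sketch of how that could look, the snippet below runs distinct() on a small made-up data frame. The data frame and its column names (date, source, target, code, URL) are assumptions for illustration only, not the actual output of returnDyad( ), so substitute the columns from your own data. It shows two options: de-duplicating on the URL column itself, and de-duplicating on the event-defining columns so that rows differing only in URL collapse into one.

```r
library(dplyr)

# Hypothetical stand-in for the data returned by returnDyad();
# the real column names may differ.
events <- data.frame(
  date   = c("2018-01-01", "2018-01-01", "2018-01-02"),
  source = c("USA", "USA", "USA"),
  target = c("CHN", "CHN", "CHN"),
  code   = c("040", "040", "190"),
  URL    = c("http://a.example/1", "http://b.example/1", "http://a.example/2"),
  stringsAsFactors = FALSE
)

# Option 1: keep the first row for each distinct URL.
by_url <- events %>% distinct(URL, .keep_all = TRUE)

# Option 2: treat rows as the same event when they match on every
# column except URL, and keep the first row of each group.
by_event <- events %>% distinct(date, source, target, code, .keep_all = TRUE)

nrow(by_url)    # 3: every URL in the toy data is unique
nrow(by_event)  # 2: the first two rows collapse into one event
```

If the duplicates in your data share everything except the URL or text source, the second form is probably closer to what you want, since de-duplicating on the URL alone leaves those rows in place.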

Please check the information on the following website. https://www.datanovia.com/en/lessons/identify-and-remove-duplicate-data-in-r/

Best, Kate

lsyaseen commented 5 years ago

Hey Kate, thanks for the prompt reply! I will proceed by using the distinct function. Thank you very much.