mjwestgate / revtools

Tools to support research synthesis in R
https://revtools.net

How to view duplicates from Revtools #31

Open RBrady1997 opened 4 years ago

RBrady1997 commented 4 years ago

Hi Martin,

I am a fairly new R user and came across your package and am interested to try it out. Thank you for developing it.

I have managed to get the following code to work with one of your example data sets:

library(revtools)

data <- read_bibliography("restoration_scopus.ris")

matches <- find_duplicates(data, match_variable = "title")

The Environment pane shows that matches exists, but I am unsure how to view its contents. Do I need to add more code?

Thank you in advance for your assistance,

Ruth

mjwestgate commented 4 years ago

Hi Ruth, Thanks for getting in touch, and for trying out revtools! I hope it's useful.

To answer your question: find_duplicates() identifies duplicates, but doesn't show them to you. If you want to view them, I'd suggest adding the result to data as a new column called matches, then passing the whole data.frame to screen_duplicates() to decide which duplicates are correct. The full code for that would be:

data <- read_bibliography("restoration_scopus.ris")
matches <- find_duplicates(data, match_variable = "title")
data$matches <- matches # makes a new column
new_data <- screen_duplicates(data)

Note that once you are done, all your changes will be visible in the new object new_data.

The alternative is to just assume that find_duplicates has got everything right and just extract the unique entries from your data.frame, as follows:

new_data <- extract_unique_references(data, matches)

That's riskier but also faster :)
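To gauge that risk before extracting anything, it may help to look at the flagged groups directly in the console. The following is a minimal base-R sketch, not part of revtools itself. It assumes that find_duplicates() returns one group ID per row, with duplicated entries sharing the same ID; the toy data.frame and the matches vector below are hypothetical stand-ins for the real objects.

```r
# Hypothetical toy data standing in for the output of read_bibliography()
data <- data.frame(
  title = c("Restoring wetlands", "Restoring wetlands",
            "Forest recovery", "Seed dispersal"),
  year = c(2010, 2010, 2012, 2015),
  stringsAsFactors = FALSE
)

# Assumed shape of the find_duplicates() result:
# one group ID per row, duplicates share an ID
matches <- c(1, 1, 2, 3)

# Attach the group IDs as a new column
data$matches <- matches

# Keep only rows whose group ID occurs more than once
is_dup <- data$matches %in% data$matches[duplicated(data$matches)]
dups <- data[is_dup, ]

# Sort so members of the same group sit next to each other
dups <- dups[order(dups$matches), ]
print(dups)
```

With the real data, the same filter can be applied once data$matches has been created, and View(dups) in RStudio opens the result as a spreadsheet-style table.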

I hope this helps! Let me know if you have more questions.

Martin