ropensci / refsplitr

R package for processing, organizing, and visualizing reference records downloaded from the Web of Science.
https://docs.ropensci.org/refsplitr

UsingRefnet.Rmd merged_records doubled arguments #2

Closed aurielfournier closed 6 years ago

aurielfournier commented 6 years ago

I'm working through UsingRefnet.R (now .Rmd), and line 199 of the .Rmd file contains this chunk:

output <- merge_records(
    references=ecuador_references, 
    authors=ecuador_authors, 
    authors__references=ecuador_authors__references, 
    references_merge=ecuador2_references, 
    authors_merge=ecuador2_authors, 
    authors__references_merge=ecuador2_authors__references,
    #references_merge=peru_references, 
    #authors_merge=peru_authors, 
    #authors__references_merge=peru_authors__references,
    #references_merge=eb_references, 
    #authors_merge=eb_authors, 
    #authors__references_merge=eb_authors__references,
    filename_root = "output/merged"
)

Originally the entire chunk was uncommented, and it did not run because each of the `_merge` arguments was supplied more than once.

But when I comment out the repeated instances of each, it works just fine.
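For anyone hitting the same thing: this looks like R's ordinary argument matching rather than anything specific to this package. A minimal sketch, where `merge_demo` is a hypothetical stand-in I made up for illustration and not the real `merge_records()`:

```r
# Hypothetical stand-in with two named arguments.
# This is NOT refsplitr's merge_records(); it only mimics the calling pattern.
merge_demo <- function(references, references_merge) {
  nrow(references) + nrow(references_merge)
}

a <- data.frame(id = 1:2)
b <- data.frame(id = 3:5)

# Supplying the same named argument twice fails during argument matching,
# before the function body ever runs. tryCatch() captures the error message.
result <- tryCatch(
  merge_demo(references = a, references_merge = b, references_merge = b),
  error = function(e) conditionMessage(e)
)

# The message reports that the formal argument was
# "matched by multiple actual arguments".
print(result)

# With each argument supplied once, the call succeeds.
ok <- merge_demo(references = a, references_merge = b)
print(ok)
```

So a single call can only take one value per argument name, which is why commenting out the repeats makes the chunk run.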

I'm struggling to understand whether there is a purpose in rewriting the function so it can actually merge all four datasets, or whether it's only supposed to handle two.

Can you provide some guidance?

embruna commented 6 years ago

Hmmm...can't think of why that was like that. I can think of reasons why one might want to combine different datasets PRIOR to disambiguating names (e.g., papers might be duplicated in different datasets, or disambiguation might be more efficient), but not after. I think your deletion of the duplicated ones is the way to go.
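As a rough illustration of combining datasets before disambiguation, here is a base R sketch that stacks two reference tables and drops papers that appear in both. The column names (`ref_id`, `title`) and the data are assumptions for the example only and do not reflect the package's actual schema:

```r
# Hypothetical reference tables; column names and values are made up.
ecuador_refs <- data.frame(
  ref_id = c("A1", "A2"),
  title  = c("Paper one", "Paper two"),
  stringsAsFactors = FALSE
)
peru_refs <- data.frame(
  ref_id = c("A2", "A3"),
  title  = c("Paper two", "Paper three"),
  stringsAsFactors = FALSE
)

# Stack the tables, then keep only the first occurrence of each record ID
# so a paper present in both datasets is counted once.
combined <- rbind(ecuador_refs, peru_refs)
combined <- combined[!duplicated(combined$ref_id), ]

print(combined)
```

Name disambiguation would then run once on the combined table instead of once per dataset.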

aurielfournier commented 6 years ago

OK great thank you! I just wanted to check and make sure I was following things.