Closed TNRiley closed 1 year ago
Took the Dimensions .ris, imported it into EndNote, exported it, and reloaded it into CiteSource: same issue. I thought this might adjust the field names, but no luck. I can't see why these abbreviated fields are being created or why the full-name fields are empty.
Worked on this a bit more and I think it's related to the manual deduplication. When running the dedup_citations function with manual_dedup=FALSE, the numbers almost swap.
Wiped my environment and ran both again. Both manual TRUE and FALSE produced accurate results, with the majority of the DIM citations as overlap. I believe I must have had something in my environment that was creating the error. My best guess is that it is related to the original Dimensions RIS that I had uploaded. This will need to be reviewed, so I'm keeping this open for now.
Verified that this issue is related to #96: in each issue the raw .ris file (Dimensions here, PsycINFO there) was the problem. A new issue should be created so that a check on .ris files is performed. This issue was solved by importing the problematic .ris files into EndNote and then exporting them as a new .ris. EndNote must account for the problem and export in a standardized format.
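A rough sketch of what such a .ris check could look like, in base R. The exact fault in the raw Dimensions and PsycINFO files was not pinned down in this thread, so the rule used here (every non-blank line must start with a two-character uppercase tag, two spaces, and a hyphen) is only an assumption about what "standardized" means. `check_ris_lines` is a hypothetical helper, not a CiteSource function.

```r
# Hypothetical helper (not part of CiteSource): flag lines that do not follow
# the conventional RIS layout "TG  - value".
# Assumption: values are not wrapped onto continuation lines; wrapped lines
# would be flagged too and would need separate handling.
check_ris_lines <- function(lines) {
  is_blank <- grepl("^\\s*$", lines)
  # Two-character tag, two spaces, hyphen, then a space or end of line
  is_tagged <- grepl("^[A-Z][A-Z0-9]  -( |$)", lines)
  bad <- which(!is_blank & !is_tagged)
  data.frame(line = bad, text = lines[bad], stringsAsFactors = FALSE)
}

# Made-up example content, not taken from the real export
example <- c(
  "TY  - JOUR",
  "TI  - An example title",
  "do - 10.1000/example",  # lowercase tag and single space: flagged
  "ER  - "
)
problems <- check_ris_lines(example)
print(problems)
```

For a real file, something like `check_ris_lines(readLines("export.ris"))` would do (the file name is a placeholder). An empty result would only mean the layout looks conventional, not that the import will map every field correctly.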
There are two-letter metadata columns that are not aligning with the fully named fields (do/DOI, JO/Journal).
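One possible workaround, sketched in base R: copy values from the two-letter columns into the matching full-name columns wherever the full-name field is empty, then drop the two-letter columns. The column names in `field_map` and the sample data are assumptions for illustration only; the real names in the imported data frame may differ, and `coalesce_fields` is a hypothetical helper, not part of CiteSource.

```r
# Hypothetical mapping: abbreviated column name -> full field name.
# These names are assumptions, not CiteSource's actual schema.
field_map <- c(DO = "doi", JO = "journal")

coalesce_fields <- function(df, map) {
  for (short in names(map)) {
    full <- map[[short]]
    if (!short %in% names(df)) next                    # nothing to merge
    if (!full %in% names(df)) df[[full]] <- NA_character_
    # Treat NA and empty strings as "empty" in the full-name column
    empty <- is.na(df[[full]]) | df[[full]] == ""
    df[[full]][empty] <- df[[short]][empty]
    df[[short]] <- NULL                                # drop the abbreviated column
  }
  df
}

# Made-up records showing the misalignment
records <- data.frame(
  doi     = c("10.1000/a", NA),
  DO      = c(NA, "10.1000/b"),
  journal = c("", "Journal B"),
  JO      = c("Journal A", NA),
  stringsAsFactors = FALSE
)
fixed <- coalesce_fields(records, field_map)
print(fixed)
```

This would have to run before deduplication, so that records from every source are compared on the same fields.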
Emailed a quick video on chasing a deduplication problem down to a potential difference in metadata.
To look at the unique DIM items from the spreadsheet, I reviewed the unique DIM records and verified that they were not unique by searching an EndNote library that contained all the citations from each database. Then I looked at the dedup results data:
dedup_results_unique <- dedup_results$unique
The example I provided was 10006. You can see how the variations of the field names are potentially causing the issue.