ropensci / refsplitr

R package for processing, organizing, and visualizing reference records downloaded from the Web of Science.
https://docs.ropensci.org/refsplitr

ROpenSci Review - Parsing/Disambiguation #63

Closed aurielfournier closed 5 years ago

aurielfournier commented 5 years ago

The main functions are very long, which makes it hard to follow what is going on. It would be great if these functions could be split into smaller units.

references_read() seems to contain a lot of repeated code to import data as a data.frame. I wonder if the WoS CSV export format could be used instead of the Plain Text format? When the data is rectangular, the readr package has great functionality for stripping whitespace (which currently takes up much of the function) and for defining column types while loading files into R.
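A minimal sketch of the rectangular-import idea, using base R so it runs without extra packages; readr's read_csv() offers the comparable `trim_ws` and `col_types` arguments. The two-column layout and the sample values are made up for illustration and are not taken from a real WoS export.

```r
# Hypothetical, tiny stand-in for a rectangular export with padded fields.
csv_text <- "AU,PY\n Smith J , 2015 \n Doe A , 2016 \n"

# strip.white drops leading/trailing whitespace from unquoted fields;
# colClasses fixes the column types at read time instead of converting later.
records <- read.csv(
  text = csv_text,
  strip.white = TRUE,
  colClasses = c(AU = "character", PY = "integer")
)

str(records)
```

Declaring the types up front means a malformed value fails at import rather than silently producing a character column.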

When importing data with references_read(), values in many columns end with a line break \n.
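One way to remove such trailing line breaks after import could look like the sketch below. `strip_trailing_newlines()` is a hypothetical helper, not a function in the package, and the sample data frame is invented.

```r
# Hypothetical helper: remove trailing line breaks from every character column.
strip_trailing_newlines <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  df[is_chr] <- lapply(df[is_chr], function(x) sub("[\r\n]+$", "", x))
  df
}

# Invented sample mimicking values that end with "\n".
records <- data.frame(
  AU = c("Smith J\n", "Doe A"),
  PY = c(2015L, 2016L),
  stringsAsFactors = FALSE
)

cleaned <- strip_trailing_newlines(records)
```

Only character columns are touched, so numeric columns keep their type.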

Some console messages are produced by calling print() (see https://github.com/embruna/refnet/search?l=R&q=print+%2A.R). Using message() and warning() instead would let users suppress them easily.
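To illustrate the difference, here is a small sketch. `report_progress()` is a hypothetical function used only for this example.

```r
# Hypothetical function: reports status via message() rather than print().
report_progress <- function(n_records) {
  message("Imported ", n_records, " records")
  invisible(n_records)
}

# Callers can silence the status output without losing the return value.
n <- suppressMessages(report_progress(10))
```

Output written with print() goes to stdout and cannot be silenced with suppressMessages(), which is why message() is the friendlier choice for status updates.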

Checking the code with lintr::lint() reports various issues that need to be addressed.

Function documentation can be improved by making more use of roxygen2 tags. Not all functions have examples. Internal functions should be tagged with @noRd so that they are not added to the manual.
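As a rough sketch of what this could look like, the block below documents one exported function with an example and one internal helper with @noRd. Both function names are hypothetical and do not come from the package.

```r
#' Collapse repeated whitespace in author names
#'
#' @param x A character vector of author names.
#' @return A character vector with runs of whitespace collapsed to a single
#'   space and leading/trailing whitespace removed.
#' @examples
#' clean_author_names("  Smith   J ")
#' @export
clean_author_names <- function(x) {
  squish_whitespace(x)
}

#' Internal helper; @noRd keeps it out of the generated manual.
#'
#' @param x A character vector.
#' @noRd
squish_whitespace <- function(x) {
  trimws(gsub("\\s+", " ", x))
}
```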

birderboone commented 5 years ago

I looked into this, and there is no way to do it differently given the way we import files, so I did my best to clean it up. I then took care of the last couple of issues: no line breaks at the end of values, print() changed to message(), and lintr passes well enough.