CarlosML closed this issue 4 years ago
I tried to solve the issue by modifying line 142 in convert2df.R. But I need an example file to check if the issue has been fixed.
I tried to upload and analyze your file, and the `summary()` function seems to work correctly now.
My colleague @andreaolmos90 and I are using bibliometrix to perform bibliometric analysis. We have observed that when importing data from Scopus using `convert2df()` and then running `summary()` on the object returned by `biblioAnalysis()`, the result of "Total Citations per Country" seems to be inconsistent with, for example, "Corresponding Author's Countries" or "Country collaboration network".

I have no experience with the R language, so this is my amateur attempt to debug the problem:

The object returned by `biblioAnalysis()` has a column 'totalCitation', which is a copy (biblioAnalysis.R @ 194) of the 'TC' column of the data frame generated by `convert2df()`. When running `summary()`, a sum is calculated over all 'totalCitation' values grouped by country (summary.bibliometrix.R @ 178). However, we have noticed that the 'totalCitation' column sometimes contains NA, and when R adds NA to an integer, the result is NA. Therefore, any country that has at least one row with an NA value in the object returned by `biblioAnalysis()` will have NA as its total number of citations.

In convert2df.R, line 142 converts TC to numeric:

`if ("TC" %in% names(M)){M$TC=as.numeric(M$TC)} else {M$TC <- 0}`

(In csvScopus2df.R, 'Cited by' is renamed to 'TC'.)

In the CSVs exported by Scopus, the "Cited by" column always seems to be present, but in some rows it has no value. So, presumably, if 'Cited by' is empty, its coercion to numeric returns NA, thus ruining the sum of total citations.
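To make the suspected mechanism concrete, here is a minimal sketch in base R using made-up data. The vectors `cited_by` and `country` are hypothetical stand-ins for the Scopus 'Cited by' column and the country field. The grouping with `tapply()` is only an illustration and is not necessarily how summary.bibliometrix.R aggregates. The two workarounds at the end are possible approaches, not the fix actually applied in the package.

```r
# Hypothetical 'Cited by' values as they might arrive from a CSV (one is empty).
cited_by <- c("12", "", "3", "7")
country  <- c("SPAIN", "SPAIN", "ITALY", "ITALY")

# Coercing the empty string to numeric yields NA.
tc <- as.numeric(cited_by)

# Summing per country: a single NA makes the whole group total NA.
totals <- tapply(tc, country, sum)
print(totals)

# Possible workaround 1: ignore NA values while summing.
totals_narm <- tapply(tc, country, sum, na.rm = TRUE)
print(totals_narm)

# Possible workaround 2: treat a missing citation count as 0
# right after the conversion, so later sums are unaffected.
tc_fixed <- tc
tc_fixed[is.na(tc_fixed)] <- 0
print(tapply(tc_fixed, country, sum))
```

With this toy data, the first total for SPAIN is NA while ITALY is 10. Both workarounds give 12 for SPAIN. Whether a missing 'Cited by' value should really count as zero citations is a judgment call for the maintainers.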