massimoaria / bibliometrix

An R-tool for comprehensive science mapping analysis. A package for quantitative research in scientometrics and bibliometrics.
https://www.bibliometrix.org

Possible issue regarding the summation of total citations per country #106

Closed CarlosML closed 4 years ago

CarlosML commented 4 years ago

My colleague @andreaolmos90 and I are using bibliometrix to perform bibliometric analysis. We have observed that when importing data from Scopus using convert2df() and then running summary() on the object returned by biblioAnalysis(), the "Total Citations per Country" result seems to be inconsistent with, for example, "Corresponding Author's Countries" or "Country collaboration network". I have no experience with the R language, so this is my amateur attempt to debug the problem:

The object returned by biblioAnalysis() has a column 'totalCitation', which is a copy (biblioAnalysis.R @ 194) of the 'TC' column of the data frame generated by convert2df(). When running summary(), a sum is calculated over all 'totalCitation' values grouped by country (summary.bibliometrix.R @ 178). However, we have noticed that the 'totalCitation' column sometimes contains NA, and when R adds NA to a number, the result is NA. Therefore, any country with at least one NA row in the object returned by biblioAnalysis() will have NA as its total number of citations. In convert2df.R, line 142 converts TC to numeric:
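The NA propagation described above can be reproduced without bibliometrix. The following is a minimal base R sketch, not the package's code: the data frame, its values, and the variable names `per_country` and `per_country_narm` are made up for illustration, and only the column name 'totalCitation' is taken from the description above.

```r
# Toy data (hypothetical values): one USA row has a missing citation count.
df <- data.frame(
  Country = c("USA", "USA", "ITALY", "ITALY"),
  totalCitation = c(10, NA, 3, 4)
)

# Plain sum per country: a single NA makes the whole group's total NA.
per_country <- tapply(df$totalCitation, df$Country, sum)
print(per_country)

# Same grouping, but sum() is told to skip missing values.
per_country_narm <- tapply(df$totalCitation, df$Country, sum, na.rm = TRUE)
print(per_country_narm)
```

In the first result the USA total is NA while the ITALY total is unaffected, which matches the symptom of only some countries being broken.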

if ("TC" %in% names(M)){M$TC=as.numeric(M$TC)} else {M$TC <- 0} (in csvScopus2df.R, 'Cited by' is renamed to 'TC')

In the CSVs exported by Scopus, the "Cited by" column always seems to be present, but in some rows it has no value. So, presumably, if 'Cited by' is empty, its coercion to numeric returns NA, which then turns the country's total citations into NA.
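A quick way to check this presumption is to coerce an empty string directly. The sketch below uses made-up values and hypothetical variable names (`cited_by`, `tc`, `tc_fixed`). Replacing NA with 0 is only one possible user-side workaround, applied here to a plain vector; it is not necessarily the change made in the package.

```r
# Hypothetical 'Cited by' values as they might arrive as text; one is empty.
cited_by <- c("12", "", "5")

# Coercion to numeric: the empty string becomes NA.
tc <- as.numeric(cited_by)
print(tc)

# Possible workaround: treat a missing citation count as zero before summing.
tc_fixed <- tc
tc_fixed[is.na(tc_fixed)] <- 0
print(sum(tc_fixed))
```

On a real data frame M returned by convert2df(), the same replacement could be applied to M$TC before calling biblioAnalysis(), assuming that treating an empty 'Cited by' as zero citations is acceptable for the analysis.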

massimoaria commented 4 years ago

I tried to solve the issue by modifying line 142 in convert2df.R, but I need an example file to check whether it has been fixed.

CarlosML commented 4 years ago

This one has this issue with the country 'USA'.

massimoaria commented 4 years ago

I uploaded and analyzed your file, and the summary function seems to work correctly now.