nakiamo closed this 4 days ago
Hi, we have just published the new version 4.3.0 on CRAN. One of the main features concerns merging between databases. It is now possible to merge between all supported databases, and when a record is duplicated, the metadata that is kept follows a fixed hierarchy that no longer depends on the order of the data frames passed to the mergeDbSources function.
The hierarchy is based on data quality and is as follows (from highest to lowest priority): 1) WoS 2) Scopus 3) OpenAlex 4) Lens 5) Dimensions 6) PubMed 7) Cochrane
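To illustrate the idea of a source hierarchy, here is a minimal base R sketch. It is not the package's internal code. The toy records, the source labels in DB, and the use of the DOI (DI) as the duplicate key are all assumptions made for the example.

```r
# Hypothetical toy records: DI = DOI, DB = source label, TC = total citations.
# The values and the DB labels are invented for illustration only.
records <- data.frame(
  DI = c("10.1/a", "10.1/a", "10.1/b", "10.1/c", "10.1/c"),
  DB = c("Scopus", "WoS", "PubMed", "Dimensions", "OpenAlex"),
  TC = c(120, 95, 10, 40, 55),
  stringsAsFactors = FALSE
)

# Priority order as described above (highest priority first).
priority <- c("WoS", "Scopus", "OpenAlex", "Lens",
              "Dimensions", "PubMed", "Cochrane")

# Rank each record by its source, then sort so that the highest-priority
# copy of every DOI comes first.
records$rank <- match(records$DB, priority)
ordered <- records[order(records$DI, records$rank), ]

# Keep only the first (highest-priority) copy of each DOI.
deduped <- ordered[!duplicated(ordered$DI), ]
rownames(deduped) <- NULL
print(deduped)
```

In this sketch the result is the same regardless of the row order of the input, which is the property described above.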
Your idea about deciding which TC value to choose is interesting. We will think about it.
Thank you for providing such a clear answer.
Hello. First of all, thanks a lot for this helpful package.
My question is about merging WoS and Scopus data: if an article appears in both the WoS and the Scopus data, which TC (total citations) value is kept in the merged data?
If I merge in this order (WoS first):
combined_wos_scopus <- mergeDbSources(wos_data, scopus_data, remove.duplicated = TRUE)
and then sort the articles by TC in descending order, this is the result:
If I merge in this order (Scopus first):
combined_wos_scopus_alternative <- mergeDbSources(scopus_data, wos_data, remove.duplicated = TRUE)
this is the result:
So the results change depending on which database you put first in the mergeDbSources function. It prioritizes the first database's TC values while merging.
This is not unexpected or wrong, but the analysis results (e.g. the list of most cited articles) change depending on which TC value is used.
Apart from these two options, another method would be to manually choose the highest TC value if an article is indexed in both databases.
I think the researcher could choose one of these three alternatives (prioritize WoS, prioritize Scopus, prioritize the highest TC value) if an article is indexed in both WoS and Scopus, and I'm not sure which one of these alternatives is ideal.
I thought it would be nice if we could have the option to choose the merging method (deciding which TC value to keep) in the mergeDbSources function.
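For what it's worth, the third alternative (keeping the highest TC) could be done by hand along these lines. This is only a rough base R sketch on made-up toy data, not something mergeDbSources offers. It assumes that duplicates can be matched on the DOI column (DI), which will not hold for records without a DOI.

```r
# Hypothetical toy exports; DI = DOI, TC = total citations, DB = source label.
wos_toy <- data.frame(DI = c("10.1/a", "10.1/b"), TC = c(95, 30),
                      DB = "WoS", stringsAsFactors = FALSE)
scopus_toy <- data.frame(DI = c("10.1/a", "10.1/c"), TC = c(120, 12),
                         DB = "Scopus", stringsAsFactors = FALSE)

# Stack both sources and compute the highest TC reported for each DOI.
stacked <- rbind(wos_toy, scopus_toy)
max_tc <- aggregate(TC ~ DI, data = stacked, FUN = max)

# Keep one row per DOI (here: the first source wins for the other fields),
# then overwrite its TC with the maximum found across sources.
merged_toy <- stacked[!duplicated(stacked$DI), ]
merged_toy$TC <- max_tc$TC[match(merged_toy$DI, max_tc$DI)]
rownames(merged_toy) <- NULL
print(merged_toy)
```

Here the article indexed in both sources keeps its WoS row but takes the larger Scopus citation count.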
I'm new to this package and I may have overlooked something that's already there, and I'm sorry if that's the case.