pybliometrics-dev / pybliometrics

Python-based API-Wrapper to access Scopus
https://pybliometrics.readthedocs.io/en/stable/

Using Web of Science as second source #190

Closed 1kastner closed 1 year ago

1kastner commented 3 years ago

Question? First of all, thank you very much for this project and your effort!

Can pybliometrics (in theory) use a database other than Scopus? In some fields, both Scopus and Web of Science are queried. Do you know of such a sister project?

Moreover, filtering duplicates etc. would require introducing a meta-level: an API-independent representation of a publication. Have you heard of such a project, or are you maybe even part of one?
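To make the idea of an API-independent representation concrete, here is a minimal sketch. The `Publication` class, `dedup_key` and `drop_duplicates` are hypothetical names for illustration only; nothing like this exists in pybliometrics. The sketch assumes a DOI is the preferred match key, with a normalized title plus year as a fallback.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Publication:
    """Hypothetical database-neutral record (not part of pybliometrics)."""
    title: str
    year: int
    doi: Optional[str] = None
    sources: set = field(default_factory=set)  # e.g. {"scopus", "wos"}


def dedup_key(pub: Publication) -> str:
    """Prefer the DOI; fall back to a normalized title plus the year."""
    if pub.doi:
        return "doi:" + pub.doi.strip().lower()
    title = re.sub(r"[^a-z0-9]+", " ", pub.title.lower()).strip()
    return f"title:{title}|{pub.year}"


def drop_duplicates(pubs: list) -> list:
    """Keep the first record per key and remember every source that had it."""
    unique = {}
    for pub in pubs:
        kept = unique.setdefault(dedup_key(pub), pub)
        kept.sources |= pub.sources  # no-op when kept is pub itself
    return list(unique.values())
```

Matching on title and year alone can produce false positives, so a real implementation would probably need fuzzier rules and manual review.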

Amadest commented 3 years ago

Hi! I second this question. Often the task is to exclude publications that appear in both databases and keep the record from only one of them. The journals that WoS indexes are also very important.

Michael-E-Rose commented 3 years ago

Hi guys! Yes indeed, there were ideas to include APIs of other bibliometric databases back when we renamed this package. That's why everybody imports from pybliometrics.scopus, so that there might be pybliometrics.wos etc. in the future. At the top of my list, though, is access to ScienceDirect (for full texts of Elsevier journals).

I know of two API wrappers for the WoS: https://pypi.org/project/wos/ and https://pypi.org/project/pywos/. I haven't checked either of them, though.

1kastner commented 3 years ago

Is there some kind of API philosophy pybliometrics.wos should follow? Would it be acceptable for you if one of the WoS API wrappers "just" gets a façade stored at pybliometrics.wos and the API wrapper is added as an optional dependency?
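For illustration, a façade over an optional dependency could look roughly like the sketch below. `WosFacade` and its methods are hypothetical, and the sketch makes no assumption about the API of the wrapped package: it only shows the lazy import and the error message a user would see when the optional package is not installed.

```python
import importlib


class WosFacade:
    """Hypothetical thin façade; the wrapped module is imported lazily."""

    def __init__(self, backend: str = "wos"):
        # "wos" is the PyPI package name mentioned above; its API is not assumed here.
        self._backend_name = backend
        self._backend = None

    def _load(self):
        """Import the optional dependency only when it is first needed."""
        if self._backend is None:
            try:
                self._backend = importlib.import_module(self._backend_name)
            except ImportError as exc:
                raise ImportError(
                    f"Optional dependency '{self._backend_name}' is missing; "
                    f"install it with: pip install {self._backend_name}"
                ) from exc
        return self._backend

    def available(self) -> bool:
        """Report whether the optional backend can be imported."""
        try:
            self._load()
        except ImportError:
            return False
        return True
```

With this layout, importing the package itself never fails just because the WoS wrapper is absent; only code paths that actually call the façade need the extra dependency.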

1kastner commented 3 years ago

In addition, summarizing the results from different databases could be of great interest. Here, my focus lies on publications (not authors or affiliations). The process of merging the results of different databases should be quite verbose, i.e. each record (publication) should show which entries came from which database, which record is a duplicate, etc. Where could such code be stored? pybliometrics.summary or similar? In addition, I would like to do some snowballing, i.e. check the cited publications and the publications that refer to the publication at hand in all supported databases. Would you accept contributions in that direction?
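A rough sketch of what such a verbose merge could produce is shown below. The function `merge_results`, the plain-dict record layout, and the use of the DOI as the only matching key are all assumptions made for illustration; they do not reflect any existing pybliometrics or WoS wrapper API.

```python
from collections import defaultdict


def merge_results(results_by_db: dict) -> list:
    """Merge per-database hits and report where each field value came from.

    Each record is a plain dict with a "doi" key (the matching key assumed here).
    """
    grouped = defaultdict(dict)  # doi -> {database: record}
    for db, records in results_by_db.items():
        for rec in records:
            grouped[rec["doi"].lower()][db] = rec

    merged = []
    for doi, per_db in grouped.items():
        provenance = defaultdict(dict)  # field -> {database: value}
        for db, rec in per_db.items():
            for key, value in rec.items():
                if key != "doi":
                    provenance[key][db] = value
        merged.append({
            "doi": doi,
            "found_in": sorted(per_db),        # which databases returned it
            "is_duplicate": len(per_db) > 1,   # present in more than one database
            "fields": {k: dict(v) for k, v in provenance.items()},
        })
    return merged
```

Keeping every database's value per field, instead of picking a winner, leaves conflicts (for example, differing citation counts) visible to the user.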

Amadest commented 3 years ago

@Michael-E-Rose Hello! I am aware of these libraries, but it seems to me that they are outdated and no longer maintained. Clarivate is actively developing its products, including the APIs. Compared to Scopus, they have cleaner data and a pleasant upload format. Maybe we could help and join the development of a new wrapper.

Michael-E-Rose commented 3 years ago

Hi guys! I appreciate your spirit very much :) I agree, having access to multiple sources for the same document or author would be a great service to the community! I suggest we have a call; writing in this thread doesn't work very well for a larger project. Would you shoot me an email so that I have your addresses and we can coordinate a meeting?

Michael-E-Rose commented 1 year ago

Since there seems to be a working Python package at https://github.com/enricobacis/wos, there is really no reason to add WoS to pybliometrics. Instead, pybliometrics will venture into the other Elsevier APIs.