scholarly-python-package / scholarly

Retrieve author and publication information from Google Scholar in a friendly, Pythonic way without having to worry about CAPTCHAs!
https://scholarly.readthedocs.io/
The Unlicense

"Inventors" in patents are not captured #246

Closed. lianjiecao closed this issue 3 years ago

lianjiecao commented 3 years ago

Hi, when using fill() to populate a publication that is a patent, the author field is missing from bib.

{'author_pub_id': 'Ql-RFR8AAAAJ:W7OEmFMy1HYC',
 'bib': {'abstract': 'Example implementations relate to assigning '
                     'microservices to cluster nodes. A sidecar proxy may be '
                     'deployed at a data plane of a distributed service. The '
                     'sidecar proxy may monitor telemetry data between '
                     'microservices of the distributed service. A '
                     'communication pattern may be determined from the '
                     'telemetry data of the distributed service. Each '
                     'microservice of the distributed service may be assigned '
                     'to a cluster node based on the communication pattern.',
         'pub_year': '2020',
         'title': 'Assignment of microservices'},
 'cites_per_year': {},
 'eprint_url': 'https://patentimages.storage.googleapis.com/f9/04/30/7ca5d8a0249f99/US10827020.pdf',
 'filled': True,
 'num_citations': 0,
 'pub_url': 'https://patents.google.com/patent/US10827020B1/en',
 'source': 'AUTHOR_PUBLICATION_ENTRY'}

I did a quick check. For a patent, Google Scholar uses "Inventors" rather than "Authors", which seems to be the problem.

scholarly-issue-tracking commented 3 years ago

Solution: When filling a publication, check for either the "Authors" or the "Inventors" key and use whichever is present to populate ['bib']['author'].
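For illustration only, here is a minimal sketch of that check. It is not the actual scholarly implementation: the helper name `fill_author_field`, the `fields` dict of label/value pairs, and the inventor names are all hypothetical. It assumes the names arrive as a comma-separated string and joins them with " and ", the format of the author string in filled publications.

```python
def fill_author_field(bib, fields):
    """Populate bib['author'] from an 'Authors' or an 'Inventors' field.

    `fields` is a hypothetical mapping of the labels shown on a publication
    page to their text values; it is not a scholarly data structure.
    """
    for label, value in fields.items():
        # Patents are labeled "Inventors", other publications "Authors".
        if label.strip().lower() in ("authors", "inventors"):
            names = [name.strip() for name in value.split(",") if name.strip()]
            bib["author"] = " and ".join(names)
            break
    return bib


# Placeholder data, not taken from a real patent record.
patent_fields = {"Inventors": "Jane Doe, John Roe", "Publication date": "2020/11/3"}
print(fill_author_field({"title": "Example patent"}, patent_fields))
# → {'title': 'Example patent', 'author': 'Jane Doe and John Roe'}
```

Matching on the lowercased label keeps the check in one place, so patents and regular publications end up with the same bib key.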

Code:


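from scholarly import scholarly

# Look up the author profile and fill it to load the publication list
search_query = scholarly.search_author('Lianjie Cao')
author = scholarly.fill(next(search_query))
# Fill a single publication to retrieve its complete bib entry
pub = scholarly.fill(author['publications'][10])
scholarly.pprint(pub)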

Output:

{'author_pub_id': 'Ql-RFR8AAAAJ:YsMSGLbcyi4C',
 'bib': {'abstract': 'Network slicing enables Communication Service Providers '
                     'to partition physical infrastructure into '
                     'logically-independent networks. Network slices must be '
                     'provisioned to meet the Service-Level Objectives (SLOs) '
                     'of disparate offerings, such as enhanced Mobile '
                     'Broadband, Ultra Reliable Low Latency Communications, '
                     'and massive Machine Type Communications. Network '
                     'orchestrators must customize service placement and '
                     'scaling to achieve the SLO of each network slice. In '
                     'this article, we describe the challenges encountered by '
                     'network orchestrators in allocating resources to '
                     'disparate 5G network slices, and propose the use of '
                     'artificial intelligence to make core placement and '
                     'scaling decisions that meet the requirements of network '
                     'slices deployed on shared infrastructure. We explore how '
                     'artificial intelligence-driven scaling algorithms, '
                     'coupled with functionality-aware placement, can enable '
                     'providers to design …',
         'author': 'Amit Sheoran and Sonia Fahmy and Lianjie Cao and Puneet '
                   'Sharma',
         'journal': 'IEEE Internet Computing',
         'number': '01',
         'pages': '1-1',
         'pub_year': 2021,
         'publisher': 'IEEE Computer Society',
         'title': 'AI-Driven Provisioning in the 5G Core'},
 'cites_per_year': {},
 'filled': True,
 'num_citations': 0,
 'pub_url': 'https://www.computer.org/csdl/magazine/ic/5555/01/09345397/1qTYFFAyZbi',
 'source': 'AUTHOR_PUBLICATION_ENTRY'}