WDscholia / scholia

Wikidata-based scholarly profiles
https://scholia.toolforge.org

Added new xpath to handle 'summary_title' class in ojs.py #2419

Closed faresh9 closed 8 months ago

faresh9 commented 8 months ago

Title

Fixes #2392 and #2366, better PR than #2418

Description

Updated the issue_url_to_paper_urls function to fix the OJS scraper, which was not working for the provided URLs (https://tidsskrift.dk/sygdomogsamfund/issue/view/10555) and (https://tidsskrift.dk/sygdomogsamfund/issue/view/3537). The function now uses a more robust XPath expression to extract paper URLs from the issue webpage.
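As a rough illustration of the idea (not the code in this PR), the sketch below tries several XPath expressions in turn until one yields article links. The markup fragments, the example.org URLs, the candidate expressions, and the helper name extract_paper_urls are all assumptions made for the example. It uses the standard library's ElementTree, which only handles well-formed markup and a limited XPath subset; a real scraper would need an HTML-tolerant parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed fragments standing in for two OJS issue-page
# layouts. The real pages are HTML and their structure may differ.
OLD_LAYOUT = """
<html><body>
  <div class="title"><a href="https://example.org/article/view/1">One</a></div>
  <div class="title"><a href="https://example.org/article/view/2">Two</a></div>
</body></html>
"""

NEW_LAYOUT = """
<html><body>
  <a class="summary_title" href="https://example.org/article/view/3">Three</a>
</body></html>
"""

# Assumed candidate expressions, tried in order. These are illustrative and
# are not the expressions used in scholia/scrape/ojs.py.
CANDIDATE_XPATHS = (
    ".//div[@class='title']/a",
    ".//a[@class='summary_title']",
)


def extract_paper_urls(markup):
    """Return href values from the first candidate XPath that matches."""
    root = ET.fromstring(markup.strip())
    for xpath in CANDIDATE_XPATHS:
        urls = [a.get("href") for a in root.findall(xpath) if a.get("href")]
        if urls:
            return urls
    # No candidate matched: report no papers rather than raising.
    return []
```

Trying the older expression first keeps pages that already worked unchanged, while the second expression covers pages that mark article links with the summary_title class.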

Caveats

No breaking changes introduced. No new dependencies added.

Testing

Tested the updated function with the provided URLs for both issues (https://tidsskrift.dk/sygdomogsamfund/issue/view/10555) and (https://tidsskrift.dk/sygdomogsamfund/issue/view/3537) and verified that paper URLs are successfully extracted.


fnielsen commented 8 months ago

There are two small styling issues:

scholia/scrape/ojs.py:96:80: E501 line too long (86 > 79 characters)
scholia/scrape/ojs.py:105:1: E303 too many blank lines (3)
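For reference, warnings of this kind are usually resolved as sketched below. This is a generic illustration and not the actual lines 96 and 105 of ojs.py; the constant ARTICLE_LINK_XPATH, its expression, and the function article_link_xpath are made up for the example.

```python
# E501: a long string literal can be split across lines inside parentheses;
# adjacent literals are concatenated, so each line stays under 80 characters.
ARTICLE_LINK_XPATH = (
    "//a[contains(@class, 'summary_title')]/@href"
    " | //div[contains(@class, 'title')]/a/@href"
)


# E303: exactly two blank lines separate top-level definitions, as above.
def article_link_xpath():
    """Return the combined (hypothetical) XPath expression."""
    return ARTICLE_LINK_XPATH
```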

fnielsen commented 8 months ago

Thanks!