WDscholia / scholia

Wikidata-based scholarly profiles
https://scholia.toolforge.org

Paragraph breaks are removed, with no space substituted. #1501

Closed. pigsonthewing closed this issue 3 years ago

pigsonthewing commented 3 years ago

On https://scholia.toolforge.org/topic/Q936 I see text from the English Wikipedia, including "satellite navigation devices.Created by Steve Coast ".

On Scholia, there is no space between "devices" and "Created".

On the Wikipedia article there is a paragraph break between "devices" and "Created".

carlinmack commented 3 years ago

Unfortunately, this seems to be a long-standing bug in the TextExtracts extension, which our API relies on: https://phabricator.wikimedia.org/T201946
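
For context, a plain-text intro request to the TextExtracts module of the MediaWiki action API might look roughly like the sketch below. The exact parameters Scholia sends are not shown in this thread, so the parameter set and the helper name `build_extracts_url` are assumptions for illustration only.

```python
from urllib.parse import urlencode


def build_extracts_url(title: str, lang: str = "en") -> str:
    """Build a TextExtracts query URL (hypothetical helper, not Scholia's code)."""
    params = {
        "action": "query",
        "prop": "extracts",   # provided by the TextExtracts extension
        "exintro": 1,         # only the text before the first section heading
        "explaintext": 1,     # plain text instead of limited HTML
        "format": "json",
        "titles": title,
    }
    return f"https://{lang}.wikipedia.org/w/api.php?{urlencode(params)}"


# Only builds and prints the URL; no network request is made here.
print(build_extracts_url("OpenStreetMap"))
```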

Daniel-Mietchen commented 3 years ago

We could in principle look into patching TextExtracts upstream.

Daniel-Mietchen commented 3 years ago

An alternative option discussed in that Phabricator ticket would be to switch to the summary REST API: https://en.wikipedia.org/api/rest_v1/page/summary/OpenStreetMap
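
A minimal sketch of what reading from the summary endpoint could look like, using only the Python standard library. The helper names (`summary_url`, `fetch_summary`, `extract_text`) and the User-Agent string are hypothetical, and this is not Scholia's actual implementation. The response is JSON with the plain-text summary in its `extract` field. Whether that text handles the paragraph break better than TextExtracts would need to be checked against the live response.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen


def summary_url(title: str, lang: str = "en") -> str:
    """Build the REST summary URL for a page title (hypothetical helper)."""
    # Page titles use underscores instead of spaces; percent-encode the rest.
    slug = quote(title.replace(" ", "_"), safe="")
    return f"https://{lang}.wikipedia.org/api/rest_v1/page/summary/{slug}"


def extract_text(summary: dict) -> str:
    """Return the plain-text extract from a parsed summary response."""
    return summary.get("extract", "")


def fetch_summary(title: str, lang: str = "en", timeout: float = 10.0) -> dict:
    """Fetch and parse the summary JSON (performs a network request)."""
    request = Request(
        summary_url(title, lang),
        # Illustrative User-Agent; a real client should identify itself properly.
        headers={"User-Agent": "scholia-summary-example/0.1"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `extract_text(fetch_summary("OpenStreetMap"))` would then give the text to display on the topic page.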

[Screenshot: Screen Shot 2021-07-12 at 14 39 48]

carlinmack commented 3 years ago

I think using the REST API would be a good change, as it also gets rid of the pronunciation, which can be quite distracting:

image