dbpedia / extraction-framework

The software used to extract structured data from Wikipedia

Wrong extraction of {{lang|text}} clauses #747

Open s204451 opened 1 year ago

s204451 commented 1 year ago

When a value on Wikipedia is wrapped in a language template such as {{lang|text}}, DBpedia extracts nothing from within the braces.

Example:

https://dbpedia.org/page/Abruzzo
https://en.wikipedia.org/wiki/Abruzzo

native_name on Wikipedia:

{{lang|nap|Abbrùzzu}} / {{native name|nap|Abbrùzze}}

Output on DBpedia (dbp:nativeName):

/

Expected output:

Abbrùzzu / Abbrùzze


Another example is Tokyo (https://en.wikipedia.org/wiki/Tokyo, https://dbpedia.org/page/Tokyo), where native_name = {{Nihongo2|東京都}} is not extracted because the template evaluates to nothing.
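
To make the expected behaviour concrete, here is a minimal Python sketch of the unwrapping I have in mind. It is not the framework's code (which is not written in Python), and the names `TEXT_ARG` and `unwrap_lang_templates` are hypothetical. The table of which positional argument holds the display text covers only the three templates mentioned in this issue, and nested templates are not handled.

```python
import re

# Hypothetical table: lowercased template name -> 1-based position of the
# argument holding the display text. Covers only the templates in this issue.
TEXT_ARG = {
    "lang": 2,         # {{lang|nap|Abbrùzzu}}
    "native name": 2,  # {{native name|nap|Abbrùzze}}
    "nihongo2": 1,     # {{Nihongo2|東京都}}
}

# Matches a single template that has at least one argument and no nested templates.
TEMPLATE = re.compile(r"\{\{([^{}|]+)\|([^{}]*)\}\}")


def unwrap_lang_templates(value: str) -> str:
    """Replace known language templates with the text they wrap."""

    def repl(match: re.Match) -> str:
        name = match.group(1).strip().lower()
        index = TEXT_ARG.get(name)
        if index is None:
            # Unknown template: leave it untouched.
            return match.group(0)
        # Keep positional arguments only. Simplification: anything containing
        # "=" is treated as a named argument (e.g. italic=no) and skipped.
        args = [a.strip() for a in match.group(2).split("|") if "=" not in a]
        return args[index - 1] if len(args) >= index else ""

    return TEMPLATE.sub(repl, value)


print(unwrap_lang_templates("{{lang|nap|Abbrùzzu}} / {{native name|nap|Abbrùzze}}"))
print(unwrap_lang_templates("{{Nihongo2|東京都}}"))
```

With the two inputs from this issue, the sketch prints `Abbrùzzu / Abbrùzze` and `東京都`, which is the output I would expect to see in dbp:nativeName.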