maxlath / wikibase-dump-filter

Filter and format a newline-delimited JSON stream of Wikibase entities

Error when using both language filter and omitting sitelinks #30

Closed shashank-agg closed 4 years ago

shashank-agg commented 4 years ago

Hi! Found a bug when using the --languages filter together with --omit sitelinks.

Reproduction: `cat latest-all.json.bz2 | bzcat | wikibase-dump-filter --languages en,de --omit sitelinks > all.ndjson`

Error:

`/usr/local/lib/node_modules/wikibase-dump-filter/lib/keep_matching_sitelinks.js:6
  Object.keys(sitelinks).forEach(sitelinkName => {
         ^

TypeError: Cannot convert undefined or null to object
    at Function.keys (<anonymous>)
    at module.exports (/usr/local/lib/node_modules/wikibase-dump-filter/lib/keep_matching_sitelinks.js:6:10)
    at /usr/local/lib/node_modules/wikibase-dump-filter/lib/format_entity.js:21:26
    at /usr/local/lib/node_modules/wikibase-dump-filter/lib/filter_format_and_serialize_entity.js:18:30
    at Stream.<anonymous> (/usr/local/lib/node_modules/wikibase-dump-filter/lib/stream_utils.js:24:22)
    at Stream.stream.write (/usr/local/lib/node_modules/wikibase-dump-filter/node_modules/through/index.js:26:11)
    at Stream.ondata (internal/streams/legacy.js:19:31)
    at Stream.emit (events.js:315:20)
    at drain (/usr/local/lib/node_modules/wikibase-dump-filter/node_modules/through/index.js:36:16)
    at Stream.stream.queue.stream.push (/usr/local/lib/node_modules/wikibase-dump-filter/node_modules/through/index.js:45:5)
`

The error occurs because we try to filter the sitelinks by language even when the sitelinks field has been omitted by the user. I have raised a PR with the fix here: https://github.com/maxlath/wikibase-dump-filter/pull/29 Great library, btw :)
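To illustrate the failure mode, here is a minimal sketch of a null-guarded sitelinks filter. It is not the actual content of `lib/keep_matching_sitelinks.js` or of the PR: the function name `keepMatchingSitelinks` and its signature are hypothetical, and it assumes sitelink keys are a language code followed by a project name (e.g. `enwiki`, `dewikiquote`).

```javascript
// Hypothetical sketch, not the library's real implementation.
// Assumption: sitelink keys look like "<language><project>", e.g. "enwiki".
function keepMatchingSitelinks (sitelinks, languages) {
  // Guard: when the sitelinks attribute was omitted it is undefined here,
  // and Object.keys(undefined) throws
  // "TypeError: Cannot convert undefined or null to object".
  if (sitelinks == null) return sitelinks

  const kept = {}
  for (const [name, value] of Object.entries(sitelinks)) {
    // "enwiki" -> "en", "dewikiquote" -> "de"
    const lang = name.split('wik')[0]
    if (languages.includes(lang)) kept[name] = value
  }
  return kept
}

// Omitted sitelinks pass through untouched instead of throwing
const omitted = keepMatchingSitelinks(undefined, ['en', 'de'])

// Present sitelinks are filtered by language as usual
const filtered = keepMatchingSitelinks(
  { enwiki: { title: 'A' }, frwiki: { title: 'B' }, dewikiquote: { title: 'C' } },
  ['en', 'de']
)
console.log(omitted, Object.keys(filtered))
```

With the guard in place, the language filter becomes a no-op for entities whose sitelinks were already dropped, so the two options no longer conflict.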

maxlath commented 4 years ago

fixed by #29 and published as v5.0.3 :tada: