maxlath / wikibase-dump-filter

Filter and format a newline-delimited JSON stream of Wikibase entities

Troubles with old wikidump #39

Closed danielshtel closed 2 years ago

danielshtel commented 2 years ago

Hi! I'm working with old Wikidata dumps. The dump was downloaded from the official source. I'm trying to filter my unpacked dump with different parameters, and I get this error:

daniillevchenko@danil ~/w/P/RaiseWikibase (main) [SIGPIPE|1]> cat entities.json | wikibase-dump-filter --claim 'P31:Q5' > another_my_test.json
    parsed | total average parse time | recent average parse time |       kept | % of total |   last kept | last kept time | elapsed time
         0 |                      0ms |                       0ms |          0 |         0% |             |              0 |     00:00:00
/usr/lib/node_modules/wikibase-dump-filter/lib/valid_claims.js:20
  let propClaims = claims[P]
                         ^

TypeError: Cannot read properties of undefined (reading 'P31')
    at /usr/lib/node_modules/wikibase-dump-filter/lib/valid_claims.js:20:26
    at arraySome (/usr/lib/node_modules/wikibase-dump-filter/node_modules/lodash.some/index.js:140:9)
    at some (/usr/lib/node_modules/wikibase-dump-filter/node_modules/lodash.some/index.js:1838:10)
    at /usr/lib/node_modules/wikibase-dump-filter/lib/valid_claims.js:14:10
    at arrayEvery (/usr/lib/node_modules/wikibase-dump-filter/node_modules/lodash.every/index.js:140:10)
    at every (/usr/lib/node_modules/wikibase-dump-filter/node_modules/lodash.every/index.js:1865:10)
    at module.exports (/usr/lib/node_modules/wikibase-dump-filter/lib/valid_claims.js:10:10)
    at /usr/lib/node_modules/wikibase-dump-filter/lib/filter_entity.js:13:10
    at /usr/lib/node_modules/wikibase-dump-filter/lib/filter_format_and_serialize_entity.js:16:9
    at Stream.<anonymous> (/usr/lib/node_modules/wikibase-dump-filter/lib/stream_utils.js:24:22)

Node.js v17.3.0

I tried this yesterday with the latest Wikidata dump and it worked normally. But with the old dump I run into this problem, which I cannot fix. Please point me in the right direction: is the problem in the old dump or in something I'm doing?

Thanks in advance for your reply!

maxlath commented 2 years ago

I wrote a patch for the old dump format in d5dc5c6, published in v5.0.7. It should work now; could you confirm?
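
For readers hitting the same trace: the TypeError above means the entity being filtered had no `claims` object at all, so `claims[P]` failed. The snippet below is only a minimal sketch of a defensive check for that situation, not the actual content of d5dc5c6. The helper name `hasClaim` and the two sample entities are hypothetical, and it assumes the current Wikibase JSON layout (`claims.P31[].mainsnak.datavalue.value.id`) for entities that do carry claims.

```javascript
// Hypothetical helper for illustration; not the library's actual code.
// Returns true if the entity has a claim `property` whose main value is the item `value`.
function hasClaim (entity, property, value) {
  // Assumption: some entities in old dumps lack a `claims` key,
  // so fall back to an empty object instead of reading a property of undefined.
  const claims = (entity && entity.claims) || {}
  const propClaims = claims[property] || []
  return propClaims.some(claim => {
    const datavalue = claim.mainsnak && claim.mainsnak.datavalue
    return datavalue != null && datavalue.value != null && datavalue.value.id === value
  })
}

// Hypothetical sample entities: one in the current layout, one without claims.
const modern = {
  id: 'Q1',
  claims: { P31: [ { mainsnak: { datavalue: { value: { id: 'Q5' } } } } ] }
}
const withoutClaims = { id: 'Q2' }

console.log(hasClaim(modern, 'P31', 'Q5'))
console.log(hasClaim(withoutClaims, 'P31', 'Q5'))
```

With a guard like this, an entity without claims is simply skipped by the filter instead of crashing the whole stream.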

danielshtel commented 2 years ago

Wow, thank you so much for the quick fix! It's working now.