brawer / wikidata-qrank

Ranking signals for Wikidata
https://qrank.wmcloud.org
MIT License

Collect page sizes #38

Closed by brawer 5 months ago

brawer commented 6 months ago

The page size in bytes would be an interesting ranking signal. The Wikimedia SQL dumps include a page_len attribute in the page table. We could aggregate the page sizes across all wikis (perhaps excluding Wikidata) and export the average (or the median? the 90th percentile?) as a ranking signal.
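To make the choice between average, median, and 90th percentile concrete, here is a minimal sketch of the aggregation. It is not the project's implementation. It assumes the page_len values have already been extracted from the dumps and joined to Wikidata item IDs; the row shape `(wiki, item_id, page_len)`, the function names, and the use of `wikidatawiki` as the name of the excluded wiki are all hypothetical choices for illustration.

```python
import math
from collections import defaultdict
from statistics import mean, median


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def aggregate_page_sizes(rows, excluded_wikis=("wikidatawiki",)):
    """Aggregate page sizes per item across wikis.

    rows: iterable of (wiki, item_id, page_len) tuples. This input
    shape is an assumption, not the format of the SQL dumps.
    """
    sizes = defaultdict(list)
    for wiki, item_id, page_len in rows:
        if wiki in excluded_wikis:
            continue  # skip wikis we do not want to count
        sizes[item_id].append(page_len)

    result = {}
    for item_id, values in sizes.items():
        values.sort()
        result[item_id] = {
            "mean": mean(values),
            "median": median(values),
            "p90": percentile(values, 90),
        }
    return result
```

With only a handful of wikis per item, the nearest-rank 90th percentile is often just the largest page, so it behaves more like a maximum than like a typical size. The median is less sensitive to one unusually long article.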

brawer commented 5 months ago

We now compute this metric.