creativecommons / quantifying

quantify the size and diversity of the commons--the collection of works that are openly licensed or in the public domain
MIT License

[Feature] Track use or "people served" #45

Open TimidRobot opened 1 year ago

TimidRobot commented 1 year ago

Description

It would be great to track use or "people served"

Questions:

(This feature is blocked by #22)

Paulooh007 commented 1 year ago

This would be really helpful for assessing impact and for informing marketing and promotion efforts. It may be challenging, though, because not all licensed works are hosted on platforms that provide APIs, and not all APIs provide usage or download statistics for a particular work.

For Flickr, we can use the API to access the view count for a photo. Technically, though, anyone who has viewed a photo has "downloaded" it, since a browser must download the image to display it. So the question is: does the view count really count as a download count? The Flickr data we have collected so far already records the view count.
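If view counts are accepted as a rough proxy for use, they could be summed per license. The following is only a minimal sketch: it assumes each record looks like an entry from a `flickr.photos.search` response requested with `extras="license,views"`, and the function name `views_by_license` and the sample records are hypothetical, not part of the existing scripts.

```python
from collections import Counter


def views_by_license(photos):
    """Sum view counts per Flickr license id.

    `photos` is assumed to be a list of dicts carrying "license" and
    "views" keys (an assumption about the recorded data, not a given).
    """
    totals = Counter()
    for photo in photos:
        # Flickr returns numeric fields as strings; a missing or empty
        # "views" value is treated as 0.
        totals[str(photo.get("license"))] += int(photo.get("views", 0) or 0)
    return dict(totals)


# Made-up sample records, for illustration only.
sample = [
    {"id": "1", "license": "4", "views": "120"},
    {"id": "2", "license": "4", "views": "30"},
    {"id": "3", "license": "9", "views": "7"},
]
print(views_by_license(sample))  # {'4': 150, '9': 7}
```

Since a view is not necessarily a deliberate download, these totals are probably best reported as "views" rather than "downloads".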

The Internet Archive API also provides a way to get metadata about an item using the search_items() method. The metadata can include title, collection, downloads, etc. I was able to create a simple script that searches for items with a CC BY-NC-SA 4.0 license and returns their download counts (see gist). We may have to get the download count (or a cumulative sum) of each item for every license type.
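A cumulative sum per license could look roughly like the sketch below. This is not the gist mentioned above. It assumes the third-party `internetarchive` package is installed, that items expose `licenseurl` and `downloads` fields, and that a quoted `licenseurl:` query matches the intended items; the helper names `fetch_items` and `sum_downloads_by_license` are hypothetical.

```python
from collections import defaultdict


def sum_downloads_by_license(items):
    """Return cumulative download counts keyed by license URL."""
    totals = defaultdict(int)
    for item in items:
        license_url = item.get("licenseurl")
        if not license_url:
            continue  # skip items that carry no license metadata
        totals[license_url] += int(item.get("downloads", 0) or 0)
    return dict(totals)


def fetch_items(license_url):
    """Search the Internet Archive for items with the given license URL.

    Imported lazily so the aggregation above works without the package.
    The query syntax here is an assumption and may need adjusting.
    """
    from internetarchive import search_items

    query = f'licenseurl:"{license_url}"'
    return search_items(query, fields=["identifier", "licenseurl", "downloads"])


# Offline example with made-up records shaped like search results.
by_nc_sa = "https://creativecommons.org/licenses/by-nc-sa/4.0/"
sample = [
    {"identifier": "item-a", "licenseurl": by_nc_sa, "downloads": 12},
    {"identifier": "item-b", "licenseurl": by_nc_sa, "downloads": 30},
    {"identifier": "item-c", "downloads": 99},
]
print(sum_downloads_by_license(sample))
```

For a live run, `sum_downloads_by_license(fetch_items(by_nc_sa))` would perform the search, though large result sets may take a while to iterate.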

Also, the Wikimedia API's imageinfo module has a globalusage property that gives information about a file's usage across Wikimedia projects, including the number of pages that embed the file, the titles of those pages, and the global usage count. This feature is helpful for tracking how widely a particular file is used across Wikimedia projects. I’m trying to see how to make this work for our use case.

It has its limitations, though: the globalusage property may not be available for all files on Wikimedia Commons. In some cases, the API response may not include a globalusage array, even if the file is used on multiple pages across Wikimedia projects. Additionally, the globalusage property only provides information about where a file is used across Wikimedia projects; it does not provide any information about usage outside of Wikimedia projects.
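To make the missing-array case concrete, here is a rough sketch of counting usages from a response. It assumes the usage list is requested with `prop=globalusage` in an `action=query` call against the Commons endpoint and that the JSON uses the default layout, with pages keyed by page id. The helper names and the User-Agent string are hypothetical, and continuation for files with more than one batch of usages is not handled.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://commons.wikimedia.org/w/api.php"


def build_query(file_title, limit=500):
    """Build query parameters asking for a file's global usage."""
    return {
        "action": "query",
        "format": "json",
        "prop": "globalusage",
        "titles": file_title,
        "gulimit": str(limit),
    }


def count_global_usage(response):
    """Map each file title to the number of usage entries returned.

    A file with no "globalusage" array is counted as 0, which may
    undercount usage given the limitation described above.
    """
    pages = response.get("query", {}).get("pages", {})
    return {
        page.get("title"): len(page.get("globalusage", []))
        for page in pages.values()
    }


def fetch_global_usage(file_title):
    """Call the API (requires network access) and return parsed JSON."""
    url = API_URL + "?" + urllib.parse.urlencode(build_query(file_title))
    request = urllib.request.Request(
        url, headers={"User-Agent": "quantifying-sketch/0.1"}
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


# Offline example with a made-up response; the second file has no array.
sample = {
    "query": {
        "pages": {
            "1": {
                "title": "File:A.jpg",
                "globalusage": [
                    {"title": "Page one", "wiki": "en.wikipedia.org"},
                    {"title": "Page two", "wiki": "de.wikipedia.org"},
                ],
            },
            "2": {"title": "File:B.jpg"},
        }
    }
}
print(count_global_usage(sample))  # {'File:A.jpg': 2, 'File:B.jpg': 0}
```

Counting a missing array as 0 keeps the totals conservative; whether that is the right choice depends on how often the array is actually absent.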