galv / lingvo-copy

Apache License 2.0

Build a flow to get the creators of CC-BY content #15

Open galv opened 3 years ago

galv commented 3 years ago

Look at the "license" or "licenseurl" field of each of our original items. If it is one of CC-BY 1.0, 2.0, 3.0, or 4.0, then we are obligated to credit the original creators of the work in order to comply with the CC-BY license. Almost all of our sources have an "uploader" or "creator" ID in their schema.
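A minimal sketch of that license check, assuming each item is a dict whose "license" and "licenseurl" values are strings. The helper name `cc_by_version` and both regular expressions are hypothetical; the real metadata may spell licenses differently, so the patterns would need checking against actual values.

```python
import re

# Matches names like "CC-BY 4.0", "CC BY 3.0", "cc-by-2.0".
# Does not match variants such as "CC-BY-SA 4.0" or "CC-BY-NC 3.0".
_NAME_RE = re.compile(r"^\s*cc[\s-]?by[\s-]?v?([1-4]\.\d)\s*$", re.IGNORECASE)

# Matches URLs like https://creativecommons.org/licenses/by/4.0/
# (a trailing jurisdiction segment such as /us/ is allowed).
_URL_RE = re.compile(r"creativecommons\.org/licenses/by/([1-4]\.\d)(?:/|$)", re.IGNORECASE)


def cc_by_version(item):
    """Return e.g. "CC-BY 4.0" if the item is plain CC-BY, else None."""
    for key, pattern in (("license", _NAME_RE), ("licenseurl", _URL_RE)):
        value = item.get(key)
        if not value:
            continue
        match = pattern.search(value)
        if match:
            # Keep the version: each CC-BY version is a distinct license.
            return f"CC-BY {match.group(1)}"
    return None
```

Items for which this returns a version are the ones that need a row in the credits table.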

We need to output a SQL table that looks like the following:

CREATE TABLE credits (
    author_or_uploader TEXT,
    name_of_work TEXT,
    -- Foreign key into a "sources" table. That would be "librivox", "archive.org", "vimeo", etc.
    source INT,
    -- This could be a foreign key into a "licenses" table, or we could simply denormalize
    -- and store the TEXT of the license. It doesn't matter. Note that CC-BY 1.0, CC-BY 2.0,
    -- CC-BY 3.0, and CC-BY 4.0 are all legally distinct licenses.
    original_license INT
);
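For illustration, here is one way the table could be populated, sketched with Python's `sqlite3` module. It assumes the denormalized option (license stored as TEXT) and a simple "sources" lookup table; the `sources` schema, the `add_credit` helper, and the two example rows are placeholders, not real data.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS sources (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE IF NOT EXISTS credits (
    author_or_uploader TEXT,
    name_of_work TEXT,
    source INT REFERENCES sources(id),
    original_license TEXT  -- denormalized, e.g. "CC-BY 3.0"
);
"""


def add_credit(conn, author, title, source_name, license_text):
    """Insert one credit row, creating the source entry if it is new."""
    conn.execute("INSERT OR IGNORE INTO sources(name) VALUES (?)", (source_name,))
    (source_id,) = conn.execute(
        "SELECT id FROM sources WHERE name = ?", (source_name,)
    ).fetchone()
    conn.execute(
        "INSERT INTO credits VALUES (?, ?, ?, ?)",
        (author, title, source_id, license_text),
    )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Placeholder rows, for illustration only.
add_credit(conn, "Example Reader", "Example Audiobook", "librivox", "CC-BY 3.0")
add_credit(conn, "Example Uploader", "Example Talk", "archive.org", "CC-BY 4.0")
```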

We could then simply dump this table as a CSV file and include it in our data distribution for downloaders of the data.
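The dump itself could be as small as the sketch below, using only the standard `sqlite3` and `csv` modules. The function name `dump_credits_csv` is hypothetical, and the single inserted row is a placeholder so the example runs on its own.

```python
import csv
import io
import sqlite3


def dump_credits_csv(conn, out):
    """Write the credits table to `out` as CSV, with a header row."""
    cursor = conn.execute(
        "SELECT author_or_uploader, name_of_work, source, original_license FROM credits"
    )
    writer = csv.writer(out)
    # cursor.description holds one tuple per column; item 0 is the column name.
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE credits(author_or_uploader TEXT, name_of_work TEXT, "
    "source INT, original_license INT)"
)
# Placeholder row, for illustration only.
conn.execute("INSERT INTO credits VALUES ('Example Uploader', 'Example Talk', 1, 4)")

buf = io.StringIO()
dump_credits_csv(conn, buf)
```

For a real file, open it with `open(path, "w", newline="")` as the `csv` module recommends, and join against the sources and licenses tables first if downloaders should see names rather than integer keys.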