creativecommons / quantifying

Quantifying the Commons

Overview

This project seeks to quantify the size and diversity of the commons--the collection of works that are openly licensed or in the public domain.

Code of conduct

CODE_OF_CONDUCT.md:

The Creative Commons team is committed to fostering a welcoming community. This project and all other Creative Commons open source projects are governed by our Code of Conduct. Please report unacceptable behavior to conduct@creativecommons.org per our reporting guidelines.

Contributing

See CONTRIBUTING.md.

Project structure

In the directory tree below, fetch, process, and report refer to the three phases of the project: data gathering, data processing, and report generation.

Quantifying/
├── .github/
│   ├── workflows/
│   │   ├── fetch.yml
│   │   ├── process.yml
│   │   ├── report.yml
│   │   └── static_analysis.yml
├── data/  # Data generated by script runs
│   ├── 20XXQX/
│   │   ├── 1-fetch/
│   │   ├── 2-process/
│   │   ├── 3-report/
│   │   │   └── README.md  # All generated reports are displayed in the README
│   └── ...
├── dev/
├── pre-automation/  # All Quantifying work prior to adding automation system
├── scripts/  # Run scripts for all phases
│   ├── 1-fetch/
│   ├── 2-process/
│   ├── 3-report/
│   └── shared.py
├── .cc-metadata.yml
├── .flake8  # Python tool configuration
├── .gitignore
├── .pre-commit-config.yaml  # Static analysis configuration
├── LICENSE
├── Pipfile  # Specifies the project's dependencies and Python version
├── Pipfile.lock
├── README.md
├── env.example
├── history.md
├── pyproject.toml  # Python tools configuration
└── sources.md
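
The quarter-based layout of the data directory can be illustrated with a minimal sketch. It assumes that the 20XXQX placeholder stands for a four-digit year followed by Q and the quarter number. The helper names quarter_label and phase_dir are hypothetical and are not taken from scripts/shared.py.

```python
from datetime import date
from pathlib import PurePosixPath

# Phase directory names as they appear in the tree above.
PHASES = ("1-fetch", "2-process", "3-report")


def quarter_label(day: date) -> str:
    """Return a label such as 2024Q3 (assumed meaning of 20XXQX)."""
    quarter = (day.month - 1) // 3 + 1
    return f"{day.year}Q{quarter}"


def phase_dir(data_root: str, day: date, phase: str) -> PurePosixPath:
    """Build <data_root>/<quarter>/<phase> for one of the three phases."""
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase!r}")
    return PurePosixPath(data_root) / quarter_label(day) / phase


print(phase_dir("data", date(2024, 8, 15), "1-fetch"))  # data/2024Q3/1-fetch
```

Under these assumptions, each quarter gets its own directory and each phase reads from the directory written by the phase before it.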

Development

Prerequisites

For information on learning and installing the prerequisite technologies for this project, please see Foundational technologies — Creative Commons Open Source.

This repository uses pipenv to manage the required Python modules:

  1. Install pipenv (see the pipenv documentation for platform-specific installation instructions).
  2. Create the Python virtual environment and install prerequisites using pipenv:
    pipenv sync --dev

Running scripts that require client credentials

To run scripts that require client credentials, follow these steps:

  1. Copy the contents of the env.example file in the script's directory to .env:
    cp env.example .env
  2. Uncomment the variables in the .env file and assign values as needed. See sources.md for how to obtain credentials:
    GCS_DEVELOPER_KEY = your_api_key
    GCS_CX = your_pse_id
  3. Save the changes to the .env file.
  4. You should now be able to run scripts that require client credentials without any issues.
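
To illustrate what the .env file from the steps above provides, here is a minimal standard-library sketch that reads KEY = value lines into a dictionary. It is an illustration only: the project's scripts may load credentials differently (for example, through a dotenv library), and parse_env is a hypothetical helper, not part of this repository.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY = value lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Variables that are still commented out are ignored.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


sample = """
# GCS_DEVELOPER_KEY = your_api_key
GCS_DEVELOPER_KEY = your_api_key
GCS_CX = your_pse_id
"""
print(parse_env(sample))
```

The commented-out first line shows why step 2 matters: a variable that has not been uncommented is never picked up.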

Static analysis

Static analysis tools ensure the codebase adheres to consistent formatting and style guidelines, enhancing readability and maintainability. Also see GitHub Actions, below.

Using pre-commit

Pre-commit allows for static analysis tools (black, flake8, isort, etc.) to be run manually or with every commit:

  1. Pre-commit is installed by completing "Create the Python virtual environment and install prerequisites", above.
  2. Install the hooks or run manually:
    • Install the git hook scripts to enable automatic execution on every commit:
      pipenv run pre-commit install
    • Run manually using helper dev script:
      ./dev/check.sh [FILE]

      If no file(s) are specified, then it runs against all files:

      ./dev/check.sh
  3. (Optional) review the configuration file: .pre-commit-config.yaml
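
The two manual modes above (specific files versus all files) can be sketched in Python using pre-commit's own `run --files` and `run --all-files` options. Whether dev/check.sh is implemented this way is an assumption, and build_check_command is a hypothetical helper used only for illustration.

```python
def build_check_command(files: list[str]) -> list[str]:
    """Return the argv for a manual pre-commit run through pipenv."""
    command = ["pipenv", "run", "pre-commit", "run"]
    if files:
        # Restrict the hooks to the named files.
        command += ["--files", *files]
    else:
        # No files given: check every file in the repository.
        command.append("--all-files")
    return command


print(build_check_command([]))
print(build_check_command(["scripts/shared.py"]))
```

The returned list can be passed to subprocess.run(command, check=True) from the repository root to run the hooks.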

Resources

GitHub Actions

The .github/workflows/static_analysis.yml GitHub Actions workflow (listed in the project structure above) performs static analysis (black, flake8, and isort) on committed changes. The workflow is triggered automatically when changes are pushed to the main branch or a pull request is opened.

Data sources

See sources.md for information on the data sources used by this project.

History

For information on past efforts, see history.md.

Copying & license

Code

LICENSE: the code within this repository is licensed under the Expat/MIT license.

Data

The data within this repository is dedicated to the public domain under the CC0 1.0 Universal (CC0 1.0) Public Domain Dedication.

Documentation

The documentation within the project is licensed under a Creative Commons Attribution 4.0 International License.