Data-to-Insight-Center / patra-toolkit

Toolkit for semi-automated modelcard creation for AI/ML models.
BSD 3-Clause "New" or "Revised" License

Patra Model Card Toolkit

The Patra Toolkit is a component of the Patra ModelCards framework, designed to simplify creating model cards that document AI/ML models. It provides a structured schema that guides users in supplying essential information about a model, including its purpose, development process, and performance. The toolkit also semi-automates the capture of key information, such as fairness and explainability metrics, through integrated analysis tools. By reducing the manual effort involved in creating model cards, the Patra Toolkit encourages researchers and developers to adopt best practices for documenting their models, contributing to greater transparency and accountability in AI/ML development.

Features

The Patra Toolkit lowers the barrier to writing comprehensive, informative Model Cards in two ways. A structured schema prompts for the information a card needs, and integrated scanners automate parts of the documentation process, such as capturing fairness and explainability metrics.

For more information, please refer to the Patra ModelCards paper.

Getting Started

Installing the Patra Toolkit

The latest version can be installed from PyPI:

pip install patra-toolkit

For local installation, clone the repository and install using:

pip install -e <local_git_dir>/patra_toolkit
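
To confirm which version ended up in the environment, a quick check with the Python standard library can help. This is a minimal sketch: it assumes the installed distribution is registered under the PyPI name `patra-toolkit` used above, and `installed_version` is a hypothetical helper, not part of the toolkit.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Assumption: the distribution is registered under its PyPI name.
print(installed_version("patra-toolkit"))
```

A result of `None` suggests the install did not land in the interpreter that is currently running.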

Usage

Create a Model Card

Find the descriptions of the Model Card parameters in the schema descriptions document.

from patra_toolkit import ModelCard

mc = ModelCard(
  name="UCI Adult Data Analysis model using Tensorflow",
  version="0.1",
  short_description="UCI Adult Data analysis using Tensorflow for demonstration of Patra Model Cards.",
  full_description="We have trained a ML model using the tensorflow framework to predict income for the UCI Adult Dataset. We leverage this data to run the Patra model cards to capture metadata about the model as well as fairness and explainability metrics.",
  keywords="uci adult, tensorflow, explainability, fairness, patra",
  author="Sachith Withana",
  input_type="Tabular",
  category="classification",
  foundational_model="None"
)

# Add Model Metadata
mc.input_data = 'https://archive.ics.uci.edu/dataset/2/adult'
mc.output_data = 'https://github.iu.edu/swithana/mcwork/rf_sklearn/adult_model.pkl'

Initialize an AI/ML Model

from patra_toolkit import AIModel

ai_model = AIModel(
  name="UCI Adult Random Forest model",
  version="0.1",
  description="Census classification problem using Random Forest",
  owner="Sachith Withana",
  location="https://github.iu.edu/swithana/mcwork/randomforest/adult_model.pkl",
  license="BSD-3 Clause",
  framework="sklearn",
  model_type="random_forest",
  test_accuracy=accuracy  # accuracy: test-set accuracy from your own evaluation code
)

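# Populate Model Structure
# trained_model is the trained model object from your own training code (not defined in this snippet)
ai_model.populate_model_structure(trained_model)
mc.ai_model = ai_model

# Add Custom Metrics
# loss is the test loss from your own evaluation code (not defined in this snippet)
ai_model.add_metric("Test loss", loss)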
ai_model.add_metric("Epochs", 100)
ai_model.add_metric("Batch Size", 32)
ai_model.add_metric("Optimizer", "Adam")
ai_model.add_metric("Learning Rate", 0.0001)
ai_model.add_metric("Input Shape", "(26048, 100)")
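
The snippet above expects `accuracy` and `loss` to exist already. As a rough illustration of where such values might come from, the sketch below computes both from held-out labels and predicted probabilities using only the standard library. The helper names `compute_accuracy` and `binary_log_loss` and the toy data are hypothetical; in practice these numbers would normally come from your ML framework's own evaluation routines.

```python
import math


def compute_accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def binary_log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; y_prob holds P(label == 1)."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


# Toy held-out labels and predicted probabilities (made up for illustration).
y_test = [1, 0, 1, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.8, 0.1]
predictions = [1 if p >= 0.5 else 0 for p in y_prob]

accuracy = compute_accuracy(y_test, predictions)
loss = binary_log_loss(y_test, y_prob)
```

The resulting `accuracy` and `loss` values can then be passed to `AIModel(...)` and `add_metric(...)` as shown above.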

Run Fairness and Explainability Scanners

# To assess fairness, provide the test data, labels, predictions, the sensitive feature's name and values, and the trained model
mc.populate_bias(X_test, y_test, predictions, "gender", X_test['sex'], trained_model)

# To generate explainability metrics, specify the dataset, column names, model, and number of top features to report
mc.populate_xai(X_test, x_columns, trained_model, top_n=10)
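
To give a feel for what a fairness scan over a sensitive feature measures, the sketch below computes one common group-fairness metric, the demographic parity difference, in plain Python. Whether `populate_bias` reports exactly this quantity is an assumption; the helper names and toy data are hypothetical and only illustrate the idea of comparing prediction rates across groups.

```python
from collections import defaultdict


def selection_rates(predictions, sensitive):
    """Share of positive predictions (label 1) within each sensitive group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, sensitive):
        totals[group] += 1
        positives[group] += 1 if pred == 1 else 0
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_difference(predictions, sensitive):
    """Gap between the highest and lowest group selection rates (0 means parity)."""
    rates = list(selection_rates(predictions, sensitive).values())
    return max(rates) - min(rates)


# Toy data, made up for illustration.
predictions = [1, 0, 1, 1, 0, 0, 1, 0]
sex = ["Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female"]

print(selection_rates(predictions, sex))
print(demographic_parity_difference(predictions, sex))
```

A value near 0 means the model predicts the positive class at similar rates for every group; larger values indicate a bigger gap between groups.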

Validate and Save the Model Card

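# Capture Python package dependencies and versions before saving,
# so that they are part of the saved model card
mc.populate_requirements()

# Verify the model card content against the schema
mc.validate()

# Write the model card to a file; replace <file_path> with your own path
mc.save("<file_path>")

# Upload the model card to the Patra server; replace <patra_server_url> with your server's URL
mc.submit("<patra_server_url>")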
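
As a rough picture of what capturing package dependencies and versions involves, the sketch below lists every distribution installed in the current environment using the standard library. It is not the toolkit's implementation: `capture_requirements` is a hypothetical helper, and `populate_requirements` may gather or filter this information differently.

```python
from importlib.metadata import distributions


def capture_requirements():
    """Return sorted 'name==version' strings for the installed distributions."""
    pinned = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        ver = dist.metadata.get("Version")
        if name and ver:  # skip distributions with incomplete metadata
            pinned.add(f"{name}=={ver}")
    return sorted(pinned, key=str.lower)


requirements = capture_requirements()
print(f"{len(requirements)} packages captured")
```

Recording pinned versions alongside the model card makes it easier to rebuild the environment in which the model was trained and evaluated.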

Examples

Explore the following example notebooks and model cards to learn more about how to use the Patra Model Card Toolkit: Notebook Example, Model Card Example


License

The Patra Model Card toolkit is developed by Indiana University and distributed under the BSD 3-Clause License. See LICENSE.txt for more details.

Reference

S. Withana and B. Plale, "Patra ModelCards: AI/ML Accountability in the Edge-Cloud Continuum," 2024 IEEE 20th International Conference on e-Science (e-Science), Osaka, Japan, 2024, pp. 1-10, doi: 10.1109/e-Science62913.2024.10678710.