We use the following tools and mechanisms for experiments:
The Digital Innovation Indefinite Delivery Indefinite Quantity (IDIQ) contract.
The Data Processing Plan, which documents data transformations and the predicted and actual AI model performance for specific tasks. It combines elements of a model card and a data cover sheet, and documents curatorial provenance. Vendors are required to complete it as part of the Digital Innovation IDIQ.
In development: an NLP vendor evaluation guide and quality review recommendations.
Under recommendation: balanced datasets for benchmarking newly available AI models and tools.
Test specific use cases, models, and data with staff and users to document performance and build quality baselines and benchmarks.
Tools for use in this phase:
| Title | Description | Last Revised | Download |
| Data Processing Plan | This template documents data transformations and the predicted and actual AI model performance for specific tasks. It combines elements of a model card and a data cover sheet, and documents curatorial provenance. Vendors are required to complete it as part of the Digital Innovation IDIQ. | | |
| Digital Innovation Indefinite Delivery Indefinite Quantity (IDIQ) | The Library of Congress Digital Innovation IDIQ contract is a multi-year contracting mechanism that we can use to fulfill individual AI experiments at the Library. It also includes requirements that may be valuable to the broader community. | | |