usnistgov / ccu_validation_scoring


Computational Cultural Understanding (CCU) Evaluation Validation and Scoring Toolkit

Version: 1.3.4

Date: April 1, 2024

Table of Contents

Overview

Setup

Directory Structure and File Format

Usage

Report a Bug

Authors

Licensing Statement

Overview

This package contains the tools to validate and score the TA1 evaluation tasks: ND (norm discovery), ED (emotion detection), VD (valence diarization), AD (arousal diarization), CD (change detection) and scoring tools for the Hidden Norms (NDMAP). Please refer to the CCU Evaluation Plan for more information about CCU, the evaluation tasks, and the file formats.

This README file describes the reference annotation validation tool, system output validation tool, scoring tool, reference statistics computing tool and random submission generation tool. OpenCCU evaluation uses a subset of these tools and has its own README.

Setup

The tools mentioned above are included as a Python package. They can be run under a shell terminal and have been confirmed to work under OS X and Ubuntu.

Prerequisites

Installation

Install the Python package using the following commands:

git clone https://github.com/usnistgov/ccu_validation_scoring

cd ./ccu_validation_scoring

python3 -m pip install -e ./
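After installation, the `CCU_scoring` entry point should be on your PATH. A minimal sanity check using only the standard library (the helper name `cli_available` is ours, not part of the toolkit):

```python
import shutil

def cli_available(name: str) -> bool:
    """Return True if an executable with the given name is on PATH."""
    return shutil.which(name) is not None

if __name__ == "__main__":
    # After `pip install -e ./`, this should report True for CCU_scoring.
    print(cli_available("CCU_scoring"))
```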

Directory Structure and File Format

The CCU validation and scoring toolkit expects input directories and/or files to have specific structures and formats. This section describes those structures and formats, which are referred to in subsequent sections.

The reference directory mentioned in the validation and scoring sections must follow the LDC annotation data package directory structure and, at a minimum, must contain the following files in the given directory structure to pass validation:

<reference_directory>/
     ./data/
          norms.tab
          emotions.tab
          valence_arousal.tab
          changepoint.tab
     ./docs/
          segments.tab
          file_info.tab
     ./index_files/
          <DATASET>.system_input.index.tab

where <DATASET> is the name of the dataset.
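The layout above can be checked before running the validator. Below is a minimal sketch using only the standard library; the helper `missing_reference_files` and its name are illustrative, not part of the toolkit's API:

```python
from pathlib import Path

# Required files taken from the directory layout above.
REQUIRED = [
    "data/norms.tab",
    "data/emotions.tab",
    "data/valence_arousal.tab",
    "data/changepoint.tab",
    "docs/segments.tab",
    "docs/file_info.tab",
]

def missing_reference_files(ref_dir):
    """Return the required files that are absent from ref_dir."""
    root = Path(ref_dir)
    missing = [rel for rel in REQUIRED if not (root / rel).is_file()]
    # The index file name depends on <DATASET>, so we only check that
    # index_files/ holds at least one *.system_input.index.tab file.
    if not list((root / "index_files").glob("*.system_input.index.tab")):
        missing.append("index_files/<DATASET>.system_input.index.tab")
    return missing
```

An empty return value means all required files are present; this does not replace `CCU_scoring validate-ref`, which also checks file contents.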

Please refer to the LDC CCU annotation data package README for the formats of the above .tab files.

The toolkit includes several sample reference datasets for testing. See ccu_validation_scoring/test/reference/LDC_reference_sample or other sibling directories.

The toolkit uses different index files for various purposes:

An example of a system input index file can be found in the sample reference datasets:

ccu_validation_scoring/test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.system_input.index.tab

An example of a system output index file can be found in the sample submissions:

ccu_validation_scoring/test/pass_submissions/pass_submissions_LDC_reference_sample/ED/CCU_P1_TA1_ED_NIST_mini-eval1_20220816_050236/system_output.index.tab
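The index files, like the other `.tab` files, are tab-delimited with a header row. A generic reading sketch follows; the column names in the sample string are made up for illustration only (the real columns are defined in the LDC CCU annotation package README):

```python
import csv
import io

def read_tab(stream):
    """Read a header-bearing, tab-delimited file into a list of dicts."""
    return list(csv.DictReader(stream, delimiter="\t"))

# Hypothetical two-column contents, standing in for a real index file.
sample = "file_id\ttype\nM01000AJ8\tvideo\nM01000AJ9\taudio\n"
rows = read_tab(io.StringIO(sample))
```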

Usage

In the CCU_validation_scoring-x.x.x/ directory, run the following to check the installed version:

CCU_scoring version

CCU_scoring is an umbrella tool that has several subcommands, each with its own set of command line options. To get a list of subcommands, execute:

CCU_scoring -h

Use the -h flag on the subcommand to get the subcommand help manual. For example:

CCU_scoring score-nd -h

Reference Validation Subcommand

Validate a reference annotation directory to make sure it has the required files.

CCU_scoring validate-ref -ref <reference_directory>

Required Arguments

# an example of reference validation
CCU_scoring validate-ref -ref test/reference/LDC_reference_sample

Submission Validation Subcommands

Norm Discovery (ND) Validation Subcommand

Use the command below to validate an ND submission directory against a reference directory. The submission directory must include a system output index file.

CCU_scoring validate-nd -s <submission_directory> -ref <reference_directory>

Required Arguments

Optional Arguments

# an example of submission validation
CCU_scoring validate-nd \
-s test/pass_submissions/pass_submissions_LDC_reference_sample/ND/CCU_P1_TA1_ND_NIST_mini-eval1_20220815_164235 \
-ref test/reference/LDC_reference_sample

Norm Discovery Mapping Validation Subcommand

Use the command below to validate the format of an NDMAP submission directory against a hidden norm list. This validation applies only to the mapping file, not to the original ND submission. The hidden norm list file has a single column (no header) containing norm IDs, one per row.

CCU_scoring validate-ndmap -s <submission_directory> -n <hidden_norm_list_file>

Required Arguments

# an example of ndmap submission validation
CCU_scoring validate-ndmap \
-s test/pass_submissions/pass_submissions_LDC_reference_sample/NDMAP/CCU_P1_TA1_NDMAP_NIST_mini-eval1_20220605_050236 \
-n test/hidden_norms.txt 
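The hidden norm list format described above (a single headerless column of norm IDs) is simple enough to sanity-check before running validate-ndmap. A sketch with an illustrative helper name, checking only the single-column constraint stated above:

```python
def check_hidden_norm_list(text):
    """Return (ok, problems) for hidden norm list contents:
    one norm ID per non-empty row, in a single column."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        if "\t" in line or " " in line.strip():
            problems.append(f"line {lineno}: expected a single column")
    return (not problems, problems)
```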

Emotion Detection (ED) and Change Detection (CD) Validation Subcommands

Use the command below to validate an ED or CD submission directory against a reference directory. The submission directory must include a system output index file.

CCU_scoring validate-ed -s <submission_directory> -ref <reference_directory>
CCU_scoring validate-cd -s <submission_directory> -ref <reference_directory>

Required Arguments

# an example of ed submission validation
CCU_scoring validate-ed \
-s test/pass_submissions/pass_submissions_LDC_reference_sample/ED/CCU_P1_TA1_ED_NIST_mini-eval1_20220531_050236 \
-ref test/reference/LDC_reference_sample

# an example of cd submission validation
CCU_scoring validate-cd \
-s test/pass_submissions/pass_submissions_LDC_reference_sample/CD/CCU_P1_TA1_CD_NIST_mini-eval1_20220531_050236 \
-ref test/reference/LDC_reference_sample

Valence Diarization (VD) and Arousal Diarization (AD) Validation Subcommands

Use the command below to validate a VD or AD submission directory against a reference directory. The submission directory must include a system output index file.

CCU_scoring validate-vd -s <submission_directory> -ref <reference_directory>
CCU_scoring validate-ad -s <submission_directory> -ref <reference_directory>

Required Arguments

Optional Arguments

# an example of vd submission validation
CCU_scoring validate-vd \
-s test/pass_submissions/pass_submissions_LDC_reference_sample/VD/CCU_P1_TA1_VD_NIST_mini-eval1_20220531_050236 \
-ref test/reference/LDC_reference_sample

# an example of ad submission validation
CCU_scoring validate-ad \
-s test/pass_submissions/pass_submissions_LDC_reference_sample/AD/CCU_P1_TA1_AD_NIST_mini-eval1_20220531_050236 \
-ref test/reference/LDC_reference_sample

Submission Scoring Subcommands

Norm Discovery (ND) Scoring Subcommand

Use the command below to score an ND submission directory against a reference directory with a scoring index file. The submission directory must include a system output index file.

CCU_scoring score-nd -s <norm_submission_directory> -ref <reference_directory> -i <scoring_index_file>

Norm Discovery Mapping Scoring Subcommand

Use the command below to score an NDMAP submission directory and an ND submission against a reference directory with a scoring index file. The submission directory must include a system output index file.

CCU_scoring score-nd -s <norm_submission_directory> -m <norm_mapping_submission_directory> -ref <reference_directory> -i <scoring_index_file>

Required Arguments

Optional Arguments

# an example of norm scoring
CCU_scoring score-nd \
-s test/pass_submissions/pass_submissions_LDC_reference_sample/ND/CCU_P1_TA1_ND_NIST_mini-eval1_20220815_164235 \
-ref test/reference/LDC_reference_sample \
-i test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.ND.scoring.index.tab

# an example of ndmap scoring
CCU_scoring score-nd \
-s test/pass_submissions/pass_submissions_LDC_reference_sample/ND/CCU_P1_TA1_ND_NIST_mini-eval1_20220531_050236 \
-m test/pass_submissions/pass_submissions_LDC_reference_sample/NDMAP/CCU_P1_TA1_NDMAP_NIST_mini-eval1_20220605_050236 \
-ref test/reference/LDC_reference_sample \
-i test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.ND.scoring.index.tab

Emotion Detection (ED) Scoring Subcommand

Use the command below to score an ED submission directory against a reference directory with a scoring index file. The submission directory must include a system output index file.

CCU_scoring score-ed -s <emotion_submission_directory> -ref <reference_directory> -i <scoring_index_file>

Required Arguments

Optional Arguments

# an example of ed scoring
CCU_scoring score-ed \
-s test/pass_submissions/pass_submissions_LDC_reference_sample/ED/CCU_P1_TA1_ED_NIST_mini-eval1_20220531_050236 \
-ref test/reference/LDC_reference_sample \
-i test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.ED.scoring.index.tab

Valence Diarization (VD) and Arousal Diarization (AD) Scoring Subcommands

Use the commands below to score a VD or AD submission directory against a reference directory with a scoring index file. The submission directory must include a system output index file.

CCU_scoring score-vd -s <valence_submission_directory> -ref <reference_directory> -i <scoring_index_file>
CCU_scoring score-ad -s <arousal_submission_directory> -ref <reference_directory> -i <scoring_index_file>

Required Arguments

Optional Arguments

# an example of vd scoring
CCU_scoring score-vd \
-s test/pass_submissions/pass_submissions_LDC_reference_sample/VD/CCU_P1_TA1_VD_NIST_mini-eval1_20220531_050236 \
-ref test/reference/LDC_reference_sample \
-i test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.VD.scoring.index.tab

# an example of ad scoring
CCU_scoring score-ad \
-s test/pass_submissions/pass_submissions_LDC_reference_sample/AD/CCU_P1_TA1_AD_NIST_mini-eval1_20220531_050236 \
-ref test/reference/LDC_reference_sample \
-i test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.AD.scoring.index.tab

Change Detection (CD) Scoring Subcommand

Use the command below to score a CD submission directory against a reference directory with a scoring index file. The submission directory must include a system output index file.

CCU_scoring score-cd -s <change_submission_directory> -ref <reference_directory> -i <scoring_index_file>

Required Arguments

Optional Arguments

# an example of cd scoring
CCU_scoring score-cd \
-s test/pass_submissions/pass_submissions_LDC_reference_sample/CD/CCU_P1_TA1_CD_NIST_mini-eval1_20220531_050236 \
-ref test/reference/LDC_reference_sample \
-i test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.CD.scoring.index.tab 

Reference Statistics Computing Tool

The following command should be run within the CCU_validation_scoring-x.x.x/ directory.

python3 scripts/ccu_ref_analysis.py -r <reference_directory> -t <task_string> -i <scoring_index_file> -o <output_file>

Required Arguments

# an example of statistics computing
python3 scripts/ccu_ref_analysis.py -r test/reference/LDC_reference_sample \
-t norms \
-i test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.ND.scoring.index.tab \
-o tmp.tab

Random Submission Generation Tool

The following command should be run within the CCU_validation_scoring-x.x.x/ directory.

python3 scripts/generate_ccu_random_submission.py -ref <reference_directory> -t <task_string> -i <scoring_index_file> -o <output_directory>

Required Arguments

# an example of random submission generation
python3 scripts/generate_ccu_random_submission.py -ref test/reference/LDC_reference_sample \
-t norms \
-i test/reference/LDC_reference_sample/index_files/LC1-SimulatedMiniEvalP1.20220909.ND.scoring.index.tab \
-o tmp

Report a Bug

Please send bug reports to nist_ccu@nist.gov.

For the bug report to be useful, please include the command line, the input files, and the text output (including the error message) in your email.

Test case bug report

A test suite has been developed and is runnable using the following command within the CCU_validation_scoring-x.x.x/ directory:

pytest

This will run the tests against a set of submissions and reference files available under test.

Authors

Jennifer Yu <yan.yu@nist.gov>

Clyburn Cunningham <clyburn.cunningham@nist.gov>

Lukas Diduch <lukas.diduch@nist.gov>

Jonathan Fiscus <jonathan.fiscus@nist.gov>

Audrey Tong <audrey.tong@nist.gov>

Licensing Statement

Full details can be found at: http://nist.gov/data/license.cfm

NIST-developed software is provided by NIST as a public service. You may use,
copy, and distribute copies of the software in any medium, provided that you
keep intact this entire notice. You may improve, modify, and create derivative
works of the software or any portion of the software, and you may copy and
distribute such modifications or works. Modified works should carry a notice
stating that you changed the software and should note the date and nature of
any such change. Please explicitly acknowledge the National Institute of
Standards and Technology as the source of the software. 

NIST-developed software is expressly provided "AS IS." NIST MAKES NO WARRANTY
OF ANY KIND, EXPRESS, IMPLIED, IN FACT, OR ARISING BY OPERATION OF LAW,
INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE, NON-INFRINGEMENT, AND DATA ACCURACY. NIST NEITHER
REPRESENTS NOR WARRANTS THAT THE OPERATION OF THE SOFTWARE WILL BE
UNINTERRUPTED OR ERROR-FREE, OR THAT ANY DEFECTS WILL BE CORRECTED. NIST DOES
NOT WARRANT OR MAKE ANY REPRESENTATIONS REGARDING THE USE OF THE SOFTWARE OR
THE RESULTS THEREOF, INCLUDING BUT NOT LIMITED TO THE CORRECTNESS, ACCURACY,
RELIABILITY, OR USEFULNESS OF THE SOFTWARE.

You are solely responsible for determining the appropriateness of using and
distributing the software and you assume all risks associated with its use,
including but not limited to the risks and costs of program errors, compliance
with applicable laws, damage to or loss of data, programs or equipment, and the
unavailability or interruption of operation. This software is not intended to
be used in any situation where a failure could cause risk of injury or damage
to property. The software developed by NIST employees is not subject to
copyright protection within the United States.