sidbdri / cookiecutter-de_analysis_skeleton

Skeleton for new differential expression analysis project.

Automatically prepare deliverable results #209

lweasel closed this issue 1 year ago

lweasel commented 1 year ago

At the moment, when delivering results, I manually run a little script which copies the relevant parts of a results set into a new directory and tar/gzips it. But we could just do this automatically at the end of run_analysis.sh. We would then no longer need the "-r" parameter in the script (see below): it would always be executed within a project, so it would know where the results directory is.

The contents of the current script are:

#!/bin/bash

set -o errexit

USAGE="Usage: prepare_de_results -r <results-dir> -o <output-name>"

while getopts ":r:o:" opt; do
    case ${opt} in
        r )
            RESULTS_DIR=$OPTARG
            ;;
        o )
            OUTPUT_NAME=$OPTARG
            ;;
        \? | : )
            # Unknown option or missing argument: print usage and fail.
            echo "$USAGE" >&2
            exit 1
            ;;
    esac
done

if [ -z "$RESULTS_DIR" ] || [ -z "$OUTPUT_NAME" ]; then
    echo "$USAGE" >&2
    exit 1
fi

OUTPUT_DIR=~/tmp/$OUTPUT_NAME
mkdir "$OUTPUT_DIR"

cp "$RESULTS_DIR/multiqc_report.html" "$OUTPUT_DIR"
cp "$RESULTS_DIR/sessionInfo.txt" "$OUTPUT_DIR"
# This glob also matches any differential_expression_tx* directories,
# so no separate copy is needed for them.
cp -r "$RESULTS_DIR"/differential_expression* "$OUTPUT_DIR"
cp -r "$RESULTS_DIR/read_counts" "$OUTPUT_DIR"

# Remove count and genes-in-sets CSV files from the copied results.
find "$OUTPUT_DIR" -name "*count*.csv" -exec rm {} +
find "$OUTPUT_DIR" -name "*genes_in_sets*.csv" -exec rm {} +

cd ~/tmp
tar cvfz "${OUTPUT_NAME}.tar.gz" "${OUTPUT_NAME}"

lweasel commented 1 year ago

The (rough) convention I used for the output file name is <date_of_run_in_YYYYMMDD_format>.<project_name>.results, so we could probably construct that automatically too.
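
The naming convention above could be built with something like the following. This is only a rough sketch: the function name deliverable_name is hypothetical, it assumes the project name is the basename of the project directory, and it assumes the "date of run" is today's date unless one is passed in.

```shell
#!/bin/bash

# Hypothetical helper: prints "<YYYYMMDD>.<project_name>.results".
#   $1  project directory (assumption: its basename is the project name)
#   $2  optional run date in YYYYMMDD format (defaults to today's date)
deliverable_name() {
    local project_dir=$1
    local run_date=${2:-$(date +%Y%m%d)}
    local project_name

    project_name=$(basename "$project_dir")
    echo "${run_date}.${project_name}.results"
}
```

At the end of run_analysis.sh this might be used as OUTPUT_NAME=$(deliverable_name "$PWD"), assuming the script is run from the top-level project directory, which would replace the "-o" parameter in the same way that running inside the project replaces "-r".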