epigen-UCSD / epigen_ucsd_django

singleCell app's coolAdmin port-in #311

Open biomystery opened 4 years ago

biomystery commented 4 years ago

Prepare a script that can launch processing with standard parameters.

Parameter specification

Portal status monitoring

opoirion commented 4 years ago
VERSION="v4" GENOMETYPE="mm10" OUTPUTNAME="MM_147" INPUTNAME="MM_147" DATASETNAME="MM_147" bash ~/data/ps-epigen_job/LIMS/10x_model.bash

Or, with qsub, using the command:

qsub -v VERSION="v4",GENOMETYPE="mm10",OUTPUTNAME="MM_147",INPUTNAME="MM_147",DATASETNAME="MM_147" ~/data/ps-epigen_job/LIMS/10x_model.bash
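
On the LIMS side, the qsub invocation above could be assembled from a parameter dictionary. The following is only a minimal sketch in Python, not the project's actual code: the names `build_qsub_command`, `submit` and `params` are hypothetical, and only the `-v` option and the variable names are taken from the command above.

```python
import subprocess


# Hypothetical helpers: these names are illustrative and are not part of
# CoolAdmin or the LIMS code base.
def build_qsub_command(script_path, params):
    """Return the qsub argument list that passes `params` as job variables."""
    for key, value in params.items():
        # qsub separates the entries of -v with commas, so a comma inside a
        # value would be read as the start of another variable.
        if "," in str(value) or "=" in key:
            raise ValueError("unsupported character in %s=%s" % (key, value))
    variable_list = ",".join("%s=%s" % (k, v) for k, v in params.items())
    return ["qsub", "-v", variable_list, script_path]


def submit(script_path, params):
    """Submit the job and return what qsub prints (normally the job id)."""
    output = subprocess.check_output(build_qsub_command(script_path, params))
    return output.decode().strip()
```

Because the command is passed as a list and no shell is involved, `~` would not be expanded, so `script_path` should be an absolute path.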
The pipeline writes a final logs file, for example:

cat /projects/ps-epigen/datasets/opoirion/output_LIMS/MM_147/repl1//repl1_MM_147_final_logs.json
{
  "logs_file": "/projects/ps-epigen/datasets/opoirion/output_LIMS/MM_147/repl1//repl1_MM_147_pipeline.log",
  "parameters": "/projects/ps-epigen/datasets/opoirion/output_LIMS/MM_147/repl1//pipeline_params.log",
  "report_address": "http://ns104190.ip-147-135-44.us:8088/dataset_report?dataset_name=MM_147&output_folder_name=output_LIMS&token=4bd4a5eea3609cab5994eda21ae4f7b5",
  "statistics": "/projects/ps-epigen/datasets/opoirion/output_LIMS/MM_147/repl1//repl1_MM_147_project_statistics.json",
  "success": true
}
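
For status monitoring, a file like the one above could be read from Python. This is a rough sketch under assumptions: the function name `read_final_logs` is hypothetical, the expected keys are simply those visible in the example, and nothing here says whether the file is also written when a run fails, so a caller would probably want to treat a missing file as "not finished or failed".

```python
import json

# Keys copied from the example file above; the function name is hypothetical.
EXPECTED_KEYS = ("logs_file", "parameters", "report_address",
                 "statistics", "success")


def read_final_logs(path):
    """Load a *_final_logs.json file and return (succeeded, report_address)."""
    with open(path) as handle:
        logs = json.load(handle)
    missing = [key for key in EXPECTED_KEYS if key not in logs]
    if missing:
        raise KeyError("missing keys in %s: %s" % (path, ", ".join(missing)))
    # Only a literal JSON `true` counts as success.
    return logs["success"] is True, logs["report_address"]
```

Returning the report address together with the status lets the caller store the link only for runs that actually succeeded.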
#!/usr/bin/bash
#PBS -q condo
#PBS -N ${OUTPUTNAME}
#PBS -l nodes=1:ppn=8,mem=${MEM}gb
#PBS -l walltime=8:00:00
#PBS -o /home/opoirion/data/logs/${OUTPUTNAME}.out
#PBS -e /home/opoirion/data/logs/${OUTPUTNAME}.err
#PBS -V
#PBS -M opoirion@hawaii.edu
#PBS -m abe
#PBS -A epigen-group

module load bowtie2
module load bwa
module load bedtools
module load samtools

# Avoid the use of the symbol '~' as a reference to the home directory

case ${GENOMETYPE} in

    "hg19")
    PROMOTER="/home/opoirion/data/ref_genomes/human/male.hg19/male.hg19_all_genes_refseq_TSS_promoter_2000.bed"
    ;;

    "hg38")
    PROMOTER="/home/opoirion/data/ref_genomes/human/hg38/hg38_all_genes_refseq_TSS_promoter_2000.bed"
    ;;

    "mm10")
    PROMOTER="/home/opoirion/data/ref_genomes/mouse/mm10/mm10_all_genes_refseq_TSS_promoter_2000.bed"
    ;;

    *)
    echo "Wrong genome type given by GENOMETYPE var: ${GENOMETYPE}"
    exit 1
    ;;

esac

case ${VERSION} in

    "v2")
    ;;

    "v4")
    ;;

    "snap")
    VERSION="v4"
    ;;

    "density")
    VERSION="v2"
    ;;

    *)
    VERSION="v4"
    ;;
esac

case ${COMPUTETSS} in
    "true"|"True"|"TRUE"|"T")
    COMPUTETSS="True"
    ;;
    *)
    COMPUTETSS="False"
    ;;
esac

REFBARCODELIST="/projects/ps-epigen/outputs/10xATAC/${OUTPUTNAME}/outs/singlecell.csv"
BEDFILE="/projects/ps-epigen/outputs/10xATAC/${OUTPUTNAME}/outs/fragments.tsv.gz"

echo "version: ${VERSION}"
echo "genome type: ${GENOMETYPE}"
echo "output name: ${OUTPUTNAME}"
echo "dataset name: ${DATASETNAME}"
echo "promoter file: ${PROMOTER}"
echo "ref barcode list: ${REFBARCODELIST}"
echo "bed file: ${BEDFILE}"

python2 ${HOME}/code/snATAC/snATAC_pipeline/clustering_pipeline.py \
       -output_name ${OUTPUTNAME} \
       -output_path /projects/ps-epigen/datasets/opoirion/output_LIMS/${DATASETNAME} \
       -bed_file ${BEDFILE} \
       -ref_barcode_list ${REFBARCODELIST} \
       -refseq_promoter_file ${PROMOTER} \
       -threads_number 8 \
       -format_output_for_webinterface True \
       -sambamba /home/oliver/prog/sambamba-0.6.8-linux-static \
       -rm_original_bed_file True \
       -workflow_version ${VERSION} \
       -snap_bin_size 5000 1000 \
       -snap_use_peak True \
       -compute_TSS_enrichment ${COMPUTETSS} \
       -bam_bigwig_for_top_clustering True \
       -perform_chromVAR_analysis False \
       -perform_cicero_analysis False \
       -is_10x True \
       -min_number_of_reads_per_cell 0 \
       -fraction_of_reads_in_peak 0.0 \
       -path_to_remote_server "opoirion@ns104190.ip-147-135-44.us:data/data_ALL/output_LIMS" \
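
The parameter handling in the script above could be mirrored on the LIMS side so that a bad value is rejected before a job is submitted. The sketch below is an illustration only: `normalize_parameters` and the constant names are hypothetical, while the accepted values and defaults are copied from the three `case` statements of the script.

```python
# Values mirrored from the case statements of the script above.
SUPPORTED_GENOMES = ("hg19", "hg38", "mm10")
VERSION_ALIASES = {"v2": "v2", "v4": "v4", "snap": "v4", "density": "v2"}
TRUE_VALUES = ("true", "True", "TRUE", "T")


# Hypothetical helper, not part of CoolAdmin or the LIMS code base.
def normalize_parameters(genome_type, version=None, compute_tss=None):
    """Return the variables as the script would see them after its checks."""
    if genome_type not in SUPPORTED_GENOMES:
        # The script exits with status 1 in this case.
        raise ValueError("Wrong genome type: %s" % genome_type)
    return {
        "GENOMETYPE": genome_type,
        # Any unknown or missing version falls back to v4, as in the script.
        "VERSION": VERSION_ALIASES.get(version, "v4"),
        # Anything outside TRUE_VALUES is treated as False, as in the script.
        "COMPUTETSS": "True" if compute_tss in TRUE_VALUES else "False",
    }
```

Keeping the accepted values in three small constants makes it easier to notice when the Python side and the bash script drift apart.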
opoirion commented 4 years ago

Link to the GitLab folder containing the CoolAdmin scripts related to the LIMS system: https://gitlab.com/Grouumf/ps-epigen_job/tree/master/LIMS

opoirion commented 4 years ago

The README docs and the function to merge multiple 10x projects are now available!

See the documentation here: https://gitlab.com/Grouumf/ps-epigen_job/tree/master/LIMS

opoirion commented 4 years ago

The documentation has been updated and now describes the general parameter workflow that can be used with the script to launch the single-cell ATAC-seq pipeline: https://gitlab.com/Grouumf/ps-epigen_job/tree/master/LIMS

brandonGonzalez01 commented 4 years ago

What happens to the report_address and parameters in the output final_logs.json file when a new job is submitted with new parameters?

I currently create a new database entry for every job submission so that past submissions can be shown to the user. Should I keep saving every job submitted for each sequence, or will each new job submitted for the same sequence overwrite the previous output, in which case I do not need to keep past submissions in the database?