Open dfornika opened 5 years ago
Let's check that variables listed in USAGE statement match those actually parsed in job scripts.
Good idea. I've added it to the list above.
USAGE
statement variables in bwa_mem.sh script don't match those referenced later in script.
https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/bwa_mem.sh#L11
case $key in
-1|--R1)
# input_R1.fastq.gz file
input_r1_fastq="$2"
shift # past argument
shift # past value
;;
-2|--R2)
# input_R2.fastq.gz file
input_r2_fastq="$2"
shift # past argument
shift # past value
;;
-r|--reference)
# reference fasta file
reference="$2"
shift # past argument
shift # past value
;;
-o|--output)
# output sam file
output="$2"
shift # past argument
shift # past value
;;
esac
done
I can change the USAGE
string variables.
In this case, the USAGE
statement contains the correct variables used in the script but these variables don't get initialized to "" (whereas an unused variable does).
https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/fastqc.sh#L22
input_fastq=""
This variable not used in script.
These variables are used but aren't initialized:
input_r1_fastq="$2"
input_r2_fastq="$2"
I can fix this.
USAGE
statement includes all variables used in script. However, the alignment
variable is initialized to "", while the model
variable isn't.
https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/iqtree.sh#L20
Do you want me to add model=""
to line 21?
USAGE
stmt has all the variables used within the script and 3 of 4 variables used in the script are initialized. I suspect treefile=""
at line 21 should be replaced with output_file=""
... do you agree?
https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/maskrc-svg.sh#L20
label
--> would you like me to change this? https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/mlst.sh#L22# input contigs.fasta file
https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/mlst.sh#L31input
variable was changed to assembly
:
https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/mlst.sh#L32
https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/mlst.sh#L55
...so I'll replace input=""
with assembly=""
.I noticed 2 issues with samtools_filter_fixmate_sort.sh:
integer
here. https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/samtools_filter_fixmate_sort.sh#L36# output sam (or bam) file
? https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/samtools_filter_fixmate_sort.sh#L42Inconsistency in variables initialized and those used:https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/samtools_fixmate.sh#L23
And the output
variable not initialized:https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/samtools_fixmate.sh#L38
I suggest changing initial variable declarations to:
input=""
output=""
...and deleting flags=0
(because the flags
variable isn't used)
...and changing https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/samtools_fixmate.sh#L37
to read
`# output sam (or bam) file
Two issues with this script:
-n|--name-order
to USAGE
stmt.https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/samtools_sort.sh#L11# output sam (or bam) file
? https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/samtools_sort.sh#L43queries
is initialized and declared but not reference otherwise in script:https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/seqtk_totalbp.sh#L24
I can delete it.
I noticed 4 issues with shovill.sh
:
-q|--queries
declared in USAGE
stmt but not used in script --> suggest removing it https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/shovill.sh#L11-o|--output_file
in in USAGE
stmt doesn't match below, where -o|--output_dir
is parsed https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/shovill.sh#L55-c|--mincov
used in script but not declared or initialized and not mentioned in USAGE
stmt. --> Should this be initialized to "" or some default value? https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/shovill.sh#L45-l|--minlen
used in script but not declared or initialized and not mentioned in USAGE
stmt. --> Should this be initialized to "" or some default value? https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/shovill.sh#L492 issues with snippy_ctgs.sh
:
ctgs=""
declaration to line 21 as this variable assigned later https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/snippy_ctgs.sh#L38
Note that snippy perl code sets default for ctgs to '' so this should work. USAGE
stmt needs to instruct the user to specify a contigs file on the command line
i.e. instead of USAGE="qsub $( basename "$BASH_SOURCE" ) [-h] -r|--ref REFERENCE_FASTA -c|--ctgs -o|--outdir OUTPUT_FILE
use this:
USAGE="qsub $( basename "$BASH_SOURCE" ) [-h] -r|--ref REFERENCE_FASTA -c|--ctgs CONTIGS_FASTA_FILE -o|--outdir OUTPUT_FILE
Only thing I noticed for snp-dists.sh is that comment states that the alignment file is a reference fasta
https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/snp-dists.sh#L29
I suggest changing this to read # alignment fasta
Suggestion for snp-sites.sh to change comment from https://github.com/Public-Health-Bioinformatics/cpo-pipeline/blob/62778005710b97c98d413331fa425f74bdc9e390/data/job_scripts/snp-sites.sh#L29
to read # alignment fasta
paths
dict.