jpuritz / dDocent

a bash pipeline for RAD sequencing
ddocent.com
MIT License

Patch GNU sort assumption out of remake_reference.sh #38

outpaddling closed 6 years ago

outpaddling commented 6 years ago

Also set SHELL to bash as "parallel" does not work with some other shells and uses $SHELL to run child processes.

Lastly, fix the logic for choosing the awk command. If "awk" is not GNU awk, then we should use "gawk".
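The detection logic described above can be sketched as a small helper. This is only an illustration, not the code from the patch: the function name `pick_gnu` and the variables `awk_cmd` and `sort_cmd` are hypothetical, and it assumes that a GNU sort, where the system `sort` is not GNU, is installed under the name `gsort` (as GNU coreutils packages commonly do on BSD and macOS).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the patch): print the first candidate
# command whose --version output mentions GNU, or fail if none does.
pick_gnu() {
    local cmd
    for cmd in "$@"; do
        # Skip candidates that are not installed. stdin is redirected so a
        # tool that does not understand --version cannot block waiting for
        # input, and stderr is merged so its complaints stay hidden.
        if command -v "$cmd" >/dev/null 2>&1 &&
           "$cmd" --version 2>&1 </dev/null | grep -q GNU; then
            printf '%s\n' "$cmd"
            return 0
        fi
    done
    return 1
}

# Prefer the plain name, fall back to the g-prefixed GNU name.
# "gsort" is an assumption about how GNU sort is installed locally.
awk_cmd=$(pick_gnu awk gawk) || echo "no GNU awk found" >&2
sort_cmd=$(pick_gnu sort gsort) || echo "no GNU sort found" >&2
```

Later commands would then be invoked through "$awk_cmd" and "$sort_cmd" rather than through the bare names, so GNU-only options such as sort's --parallel are only passed to a GNU implementation.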

--- dDocent-test/Data/remake_reference.sh 2018-06-13 15:38:07.762081000 -0500
+++ remake_reference.sh 2018-06-13 15:30:42.031595000 -0500
@@ -1,16 +1,23 @@
 export LC_ALL=en_US.UTF-8
+export SHELL=bash

 if [[ -z "$5" ]]; then
 echo "Usage is sh remake_reference.sh K1 K2 similarity% Assembly_Type Number_of_Processors"
 exit 1
 fi

-if ! awk --version | fgrep -v GNU &>/dev/null; then
+if ! awk --version | fgrep GNU &>/dev/null; then
 awk=gawk
 else
 awk=awk
 fi

+if ! sort --version | fgrep GNU &>/dev/null; then

@@ -123,7 +130,7 @@
 parallel --no-notice mawk -v x=$CUTOFF \''$1 >= x'\' ::: *.uniq.seqs | cut -f2 | perl -e 'while (<>) {chomp; $z{$_}++;} while(($k,$v) = each(%z)) {print "$v\t$k\n";}' | mawk -v x=$CUTOFF2 '$1 >= x' > uniq.k.$CUTOFF.c.$CUTOFF2.seqs
 fi

-sort -k1 -r -n --parallel=$NUMProc -S 2G uniq.k.$CUTOFF.c.$CUTOFF2.seqs |cut -f2 > totaluniqseq
+$sort -k1 -r -n --parallel=$NUMProc -S 2G uniq.k.$CUTOFF.c.$CUTOFF2.seqs |cut -f2 > totaluniqseq
 mawk '{c= c + 1; print ">dDocentContig" c "\n" $1}' totaluniqseq > uniq.full.fasta
 LENGTH=$(mawk '!/>/' uniq.full.fasta | mawk '(NR==1||length<shortest){shortest=length} END {print shortest}')
 LENGTH=$(($LENGTH * 3 / 4))
@@ -149,6 +156,7 @@

    if [ -s "rbdiv.out.$1" ]; then
                rainbow merge -o rbasm.out.$1 -a -i rbdiv.out.$1 -r 2 -N10000 -R10000 -l 20 -f 0.75

@@ -159,21 +167,21 @@
 sed -e 's/NNNNNNNNNN/ /g' uniq.fasta | cut -f1 > uniq.F.fasta
 CDHIT=$(python -c "print (max("$simC" - 0.1,0.8))")
 cd-hit-est -i uniq.F.fasta -o xxx -c $CDHIT -T $NUMProc -M 0 -g 1 -d 100 &>cdhit.log

jpuritz commented 6 years ago

Done in version 2.5.3. Patches aren't as helpful as pull requests for future issues.