hwlim / BisKit-RNA

BisKit for CCHMC HPC users

Multiple revisions needed #9

Closed cahn20 closed 5 months ago

cahn20 commented 5 months ago

Updated by Lim:

  1. "echo" is used needlessly in multiple places. In rules:
    • merge_align_stats_for_all_samples
    • merge_call_stats_for_all_samples

In script:

There may be more. Please check thoroughly and clean them up.

  2. rBis.getReadStats.sh complains about gzip at line 66:

unaligned=$( zcat $unalignedFQ | wc -l)

Is this correct?
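The issue does not include the actual error message, so the cause is a guess. One possibility is that $unalignedFQ is not always gzip-compressed, in which case zcat would fail. A minimal sketch under that assumption follows. The helper name count_lines and the temporary demo files are hypothetical and not part of BisKit.

```shell
# Hypothetical helper: print the number of lines in a file,
# whether it is gzip-compressed or plain text.
count_lines() {
    f=$1
    if gzip -t "$f" 2>/dev/null; then
        # File passes the gzip integrity test: decompress to stdout.
        gzip -dc "$f" | wc -l | tr -d ' '
    else
        # Not gzip: count lines directly.
        wc -l < "$f" | tr -d ' '
    fi
}

# Demo on a one-record FASTQ file, stored both plain and gzipped.
tmp=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n' > "$tmp/plain.fq"
gzip -c "$tmp/plain.fq" > "$tmp/reads.fq.gz"

plainLines=$(count_lines "$tmp/plain.fq")
gzLines=$(count_lines "$tmp/reads.fq.gz")
echo "plain=$plainLines gz=$gzLines"

rm -rf "$tmp"
```

In the script this would be called as unaligned=$(count_lines "$unalignedFQ"). Like the original line, it counts lines, not reads; a FASTQ record spans four lines.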

  3. In the same script, the same command is invoked multiple times, which wastes time. Please revise it. Also check for other similar instances and revise them, if any.

It can be avoided by running the repetitive part first, then reusing the results.

totalReads=$(samtools view $bam | cut -f1 | ${BISKIT_PATH}/Scripts/sortByFreq.sh | wc -l)
totalUniq=$(samtools view $bam | cut -f1 | ${BISKIT_PATH}/Scripts/sortByFreq.sh | cut -f1 | grep -w "1" | wc -l)
totalMulti=$(samtools view $bam | cut -f1 | ${BISKIT_PATH}/Scripts/sortByFreq.sh | cut -f1 | grep -vw "1" | wc -l)
totalPlusReads=$(samtools view $plusBam | cut -f1 | sort -S 1G | uniq | wc -l)
plusUniq=$(samtools view $plusBam | cut -f1 | ${BISKIT_PATH}/Scripts/sortByFreq.sh | cut -f1 | grep -w "1" | wc -l)
plusMulti=$(samtools view $plusBam | cut -f1 | ${BISKIT_PATH}/Scripts/sortByFreq.sh | cut -f1 | grep -vw "1" | wc -l)
totalMinusReads=$(samtools view $minusBam | cut -f1 | ${BISKIT_PATH}/Scripts/sortByFreq.sh | wc -l)
minusUniq=$(samtools view $minusBam | cut -f1 | ${BISKIT_PATH}/Scripts/sortByFreq.sh | cut -f1 | grep -w "1" | wc -l)
minusMulti=$(samtools view $minusBam | cut -f1 | ${BISKIT_PATH}/Scripts/sortByFreq.sh | cut -f1 | grep -vw "1" | wc -l)
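A rough sketch of the "run once, reuse the results" idea follows. It assumes sortByFreq.sh prints one line per distinct read name with the occurrence count in the first column; the sketch uses sort | uniq -c as a stand-in for it. The helper name count_read_names is hypothetical, and the demo feeds a small list of names in place of samtools view output so it runs without a BAM file.

```shell
# Hypothetical helper: reads one read name per line on stdin and prints
# "<distinct names> <names seen once> <names seen more than once>".
# sort | uniq -c stands in for sortByFreq.sh (assumed: count in column 1).
count_read_names() {
    sort | uniq -c | awk '
        { total++; if ($1 == 1) uniq++; else multi++ }
        END { printf "%d %d %d\n", total, uniq, multi }'
}

# Demo input: r1 and r4 occur once, r2 twice, r3 three times.
counts=$(printf '%s\n' r1 r2 r2 r3 r3 r3 r4 | count_read_names)

# Split the single result line into the three variables.
read -r totalReads totalUniq totalMulti <<EOF
$counts
EOF

echo "total=$totalReads uniq=$totalUniq multi=$totalMulti"
```

With a real BAM file, the input would be samtools view "$bam" | cut -f1 | count_read_names, and the same call would be repeated once each for $plusBam and $minusBam. That reduces nine passes over the BAM files to three.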

cahn20 commented 5 months ago

Addressed in pull request #10