statonlab / hardwoods_site

Hardwoods Genomics bugs, data loading, and general issues
GNU General Public License v3.0

Redbud (Cercis gigantea) transcriptome #481

Closed mestato closed 5 years ago

mestato commented 5 years ago

Publication and Data Information

https://www.ingentaconnect.com/contentone/ben/cbio/2016/00000011/00000001/art00007

Doesn't look like we have access to the paper, but I see RNA data in NCBI.

Additional Information

Checklist

See New Genome Documentation for detailed instructions.

RaymondS1 commented 5 years ago

Fastqc

#PBS -N 1_fastqc
#PBS -S /bin/bash
#PBS -j oe
#PBS -A ACF-UTK0011
#PBS -l nodes=1
#PBS -l walltime=00:10:00

cd $PBS_O_WORKDIR

module load fastqc

for f in /lustre/haven/gamma/staton/projects/undergrads/cercis_canadensis/rawreads/*.fastq
do
        filename=$(basename "$f")
        base="${filename%%.fastq*}"
        echo "filename $filename base $base"
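        # -p avoids an error if the output directory already exists
        mkdir -p "$base.fastqc"

        # Quote paths so unusual file names do not get word-split
        fastqc -o "$base.fastqc" "$f" >& "$base.fastqc.out"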
done

wait
RaymondS1 commented 5 years ago

Trimming

#PBS -N 2_trimming
#PBS -S /bin/bash
#PBS -j oe
#PBS -A ACF-UTK0011
#PBS -l nodes=1:ppn=2
#PBS -l walltime=00:30:00

cd $PBS_O_WORKDIR

module load java

for F in /lustre/haven/gamma/staton/projects/undergrads/cercis_canadensis/rawreads/*.fastq
do
        # Strip only a literal trailing ".fastq" (the dot is escaped and the match is anchored)
        BASE=$( basename "$F" | sed 's/\.fastq$//')
        echo "F $F"
        echo "base $BASE"

        java -jar /lustre/haven/gamma/staton/software/Trimmomatic-0.36/trimmoma$ 
RaymondS1 commented 5 years ago

Fastqc_trimmed

#PBS -N fastqc_trimmed
#PBS -S /bin/bash
#PBS -j oe
#PBS -A ACF-UTK0011
#PBS -l nodes=1
#PBS -l walltime=00:10:00

cd $PBS_O_WORKDIR

module load fastqc

for f in /lustre/haven/gamma/staton/projects/undergrads/cercis_canadensis/2_trimming/*.fastq
do
        filename=$(basename "$f")
        base="${filename%%.fastq*}"
        echo "filename $filename base $base"
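        # -p avoids an error if the output directory already exists
        mkdir -p "$base.fastqc"

        # Quote paths so unusual file names do not get word-split
        fastqc -o "$base.fastqc" "$f" >& "$base.fastqc.out"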
done

wait
RaymondS1 commented 5 years ago

r_corrector

perl /lustre/haven/gamma/staton/software/rcorrector/run_rcorrector.pl -s ../2_trimming/SRR957672_1.trim.fastq -od .

Trinity

#PBS -N redbud_trinity
#PBS -S /bin/bash
#PBS -j oe
#PBS -A ACF-UTK0011
#PBS -l nodes=1:ppn=2
#PBS -l walltime=20:00:00

cd $PBS_O_WORKDIR

export PATH=$PATH:/lustre/haven/gamma/staton/software/bowtie2-2.3.2-legacy/

/lustre/haven/gamma/staton/software/Trinity-v2.4.0/Trinity --seqType fq --single /lustre/haven/gamma/staton/projects/undergrads/cercis_canadensis/3_rcorrector/SRR957672_1.trim.cor.fq --max_memory 20G
MattHuff commented 5 years ago

This data set has too many issues; let's close it for now.

RaymondS1 commented 5 years ago

There are many inconsistencies between the article, NCBI, and the fastq files.
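
For anyone revisiting this data set, one way to pin down such inconsistencies is to summarize each fastq file and compare the numbers by hand against the NCBI run record and the paper. The sketch below is not part of the pipeline above: `fastq_summary` is a hypothetical helper name, and it assumes standard four-line FASTQ records (no wrapped sequence lines).

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not part of the scripts in this thread):
# report read count and read-length range for one FASTQ file.
# Assumes each record is exactly four lines, so the sequence is line 2 of 4.
fastq_summary() {
    awk 'NR % 4 == 2 {
             n++
             len = length($0)
             if (n == 1 || len < min) min = len
             if (len > max) max = len
         }
         END { printf "reads=%d min_len=%d max_len=%d\n", n, min, max }' "$1"
}

# Usage (file name is a placeholder):
#   fastq_summary some_reads.fastq
```

The read count and length range can then be checked against the spot count and read layout listed for the run in NCBI; a mismatch would point to which of the three sources disagrees.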