statonlab / hardwoods_site

Hardwoods Genomics bugs, data loading, and general issues
GNU General Public License v3.0

Automated Annotations #554

Open MattHuff opened 4 years ago

MattHuff commented 4 years ago

Publication and Data Information

Additional Information

Automated annotations are located in /files/files/automated_annotation/. There will be a gene.fasta, mRNA.fasta, and polypeptide.fasta file within this directory.

Checklist

See New Genome Documentation for detailed instructions.

CaseyRichards92 commented 4 years ago

Starting on this now and moving the files to ACF.

Split into 2000 files, which can be found here: /lustre/haven/proj/UTK0032/projects/undergrads/annotations
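
The split command itself was not posted. A minimal sketch of one way to do it, assuming the chunks should be named <input>.<n> so they line up with the mRNA.fasta.$PBS_ARRAYID paths used by the job scripts in this thread. The split_fasta helper and its arguments are hypothetical names, not what was actually run:

```shell
# Sketch only: split_fasta is a hypothetical helper, not the command used here.
# Splits a FASTA file into at most N chunks named <input>.1, <input>.2, ...
# The ( ) function body runs in a subshell, so its variables stay local.
split_fasta() (
    input="$1"    # FASTA file to split, e.g. mRNA.fasta
    chunks="$2"   # maximum number of output files, e.g. 2000
    outdir="$3"   # directory for the chunks, e.g. cds_splits

    # Count sequences by counting header lines; stop if there are none
    total=$(grep -c '^>' "$input") || exit 1
    # Ceiling division: sequences per chunk
    per_chunk=$(( (total + chunks - 1) / chunks ))

    mkdir -p "$outdir"
    awk -v n="$per_chunk" -v prefix="$outdir/$(basename "$input")" '
        /^>/ {
            # Start a new chunk every n sequences
            if (seqs % n == 0) {
                if (out != "") close(out)
                out = prefix "." (++file_no)
            }
            seqs++
        }
        # Lines before the first header (if any) are skipped
        out != "" { print > out }
    ' "$input"
)
```

Called as split_fasta mRNA.fasta 2000 cds_splits, this would keep each record whole and number the chunks from 1, which is what a 1-based array index expects. Because of the ceiling division it can produce fewer than the requested number of files when the sequence count does not divide evenly.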

CaseyRichards92 commented 4 years ago

BLAST against Swiss-Prot, array jobs 1-500

#PBS -S /bin/bash
# Merge stderr into the stdout log
#PBS -j oe
#PBS -A ACF-UTK0011
# Array job: one task per split file, indices 1-500
#PBS -t 1-500
#PBS -l nodes=1:ppn=2
#PBS -l walltime=08:00:00

# Run from the directory the job was submitted from
cd "$PBS_O_WORKDIR"

module load blast

# blastx each mRNA split against Swiss-Prot and write XML (-outfmt 5)
blastx \
 -query /lustre/haven/proj/UTK0032/projects/undergrads/annotations/cds_splits/mRNA.fasta.$PBS_ARRAYID \
 -db /lustre/haven/gamma/staton/library/uniprot/uniprot_sprot.fasta \
 -out /lustre/haven/proj/UTK0032/projects/undergrads/annotations/blast/swissprot/mRNA_sprot.$PBS_ARRAYID.xml \
 -evalue 1e-5 \
 -outfmt 5

BLAST against Swiss-Prot, array jobs 501-1000

#PBS -S /bin/bash
#PBS -j oe
#PBS -A ACF-UTK0011
#PBS -t 501-1000
#PBS -l nodes=1:ppn=2
#PBS -l walltime=08:00:00

cd $PBS_O_WORKDIR

module load blast

blastx \
 -query /lustre/haven/proj/UTK0032/projects/undergrads/annotations/cds_splits/mRNA.fasta.$PBS_ARRAYID \
 -db /lustre/haven/gamma/staton/library/uniprot/uniprot_sprot.fasta \
 -out /lustre/haven/proj/UTK0032/projects/undergrads/annotations/blast/swissprot/mRNA_sprot.$PBS_ARRAYID.xml \
 -evalue 1e-5 \
 -outfmt 5

CaseyRichards92 commented 4 years ago

qsub automated_annotation.qsh
2680517[].apollo-acf
[cricha59@acf-login8 swissprot]$ qsub automated_annotation2.qsh
2680518[].apollo-acf

CaseyRichards92 commented 4 years ago

BLAST against TrEMBL, array jobs 1-500

#PBS -N casey_trembl_1
#PBS -S /bin/bash
#PBS -j oe
#PBS -A ACF-UTK0011
#PBS -t 1-500
#PBS -l nodes=1:ppn=2
#PBS -l walltime=15:00:00

cd $PBS_O_WORKDIR

module load blast

blastx \
 -query /lustre/haven/proj/UTK0032/projects/undergrads/annotations/cds_splits/mRNA.fasta.$PBS_ARRAYID \
 -db /lustre/haven/gamma/staton/library/uniprot/uniprot_trembl_plants_July_2018.fasta \
 -out /lustre/haven/proj/UTK0032/projects/undergrads/annotations/blast/trembl/mRNA_sprot.$PBS_ARRAYID.xml \
 -evalue 1e-5 \
 -outfmt 5

BLAST against TrEMBL, array jobs 501-1000

#PBS -N casey_trembl_2
#PBS -S /bin/bash
#PBS -j oe
#PBS -A ACF-UTK0011
#PBS -t 501-1000
#PBS -l nodes=1:ppn=2
#PBS -l walltime=15:00:00

cd $PBS_O_WORKDIR

module load blast

blastx \
 -query /lustre/haven/proj/UTK0032/projects/undergrads/annotations/cds_splits/mRNA.fasta.$PBS_ARRAYID \
 -db /lustre/haven/gamma/staton/library/uniprot/uniprot_trembl_plants_July_2018.fasta \
 -out /lustre/haven/proj/UTK0032/projects/undergrads/annotations/blast/trembl/mRNA_sprot.$PBS_ARRAYID.xml \
 -evalue 1e-5 \
 -outfmt 5

CaseyRichards92 commented 4 years ago

qsub annotations_trembl1.qsh
2680522[].apollo-acf
[cricha59@acf-login8 trembl]$ qsub annotations_trembl2.qsh
2680523[].apollo-acf

CaseyRichards92 commented 4 years ago

@patricksis @RaymondS1 We still need to run the Swiss-Prot and TrEMBL BLAST jobs for splits 1001-2000, and all of InterProScan (IPS).
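
For the remaining ranges, one option is to generate the job script from a template rather than hand-editing copies. The sketch below is only an illustration: make_blast_job and its arguments are hypothetical names, while the account, resource requests, split directory, and blastx options are copied from the scripts posted above.

```shell
# Sketch only: make_blast_job is a hypothetical helper that prints a Torque
# array-job script for one range of split files.
# The ( ) function body runs in a subshell, so its variables stay local.
make_blast_job() (
    range="$1"        # array range, e.g. 1001-1500
    db="$2"           # protein database FASTA
    out_prefix="$3"   # output path up to the array index, e.g. .../swissprot/mRNA_sprot
    walltime="$4"     # e.g. 08:00:00
    # Split directory taken from the scripts in this thread
    splits=/lustre/haven/proj/UTK0032/projects/undergrads/annotations/cds_splits

    # Unquoted heredoc: \$ keeps a literal $ for the job to expand at run
    # time, and \\ emits the line-continuation backslash.
    cat <<EOF
#PBS -S /bin/bash
#PBS -j oe
#PBS -A ACF-UTK0011
#PBS -t $range
#PBS -l nodes=1:ppn=2
#PBS -l walltime=$walltime

cd "\$PBS_O_WORKDIR"

module load blast

blastx \\
 -query $splits/mRNA.fasta.\$PBS_ARRAYID \\
 -db $db \\
 -out ${out_prefix}.\$PBS_ARRAYID.xml \\
 -evalue 1e-5 \\
 -outfmt 5
EOF
)
```

The printed script could be redirected to a file such as automated_annotation3.qsh (a hypothetical name) and submitted with qsub as before. Passing the output prefix in full keeps the file names consistent with whatever the 1-1000 runs already produced.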

CaseyRichards92 commented 4 years ago

Currently running the Swiss-Prot and TrEMBL BLAST jobs for splits 1001-2000.

CaseyRichards92 commented 4 years ago

InterProScan (IPS)

#PBS -A ACF-UTK0011
#PBS -S /bin/bash
# Array job: one task per polypeptide split, indices 1-500
#PBS -t 1-500
#PBS -j oe
#PBS -l nodes=1:ppn=4
#PBS -l walltime=3:30:00

# Run from the directory the job was submitted from
cd "$PBS_O_WORKDIR"

module load python3

# Run InterProScan on one polypeptide split. XML results go to the -d
# directory; stdout is saved per task under the temp directory.
/lustre/haven/gamma/staton/software/interproscan-5.34-73.0/interproscan.sh \
 -i /lustre/haven/proj/UTK0032/projects/undergrads/annotations/pep_splits/polypeptide.fasta.$PBS_ARRAYID \
 -f XML \
 -d /lustre/haven/proj/UTK0032/projects/undergrads/annotations/ips/xmls \
 --disable-precalc \
 --iprlookup \
 --goterms \
 --pathways \
 --tempdir /lustre/haven/proj/UTK0032/projects/undergrads/annotations/ips/tmp \
 > /lustre/haven/proj/UTK0032/projects/undergrads/annotations/ips/tmp/$PBS_ARRAYID.out

CaseyRichards92 commented 4 years ago

@almasaeed2010 Are these automated annotations being loaded to the site? They are complete on the ACF.
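
Before loading, it may be worth confirming that every array index produced an output file. A rough sketch, assuming the outputs follow the <prefix>.<index>.xml pattern from the -out paths in the BLAST scripts above; missing_outputs is a hypothetical helper, not an existing tool:

```shell
# Sketch only: missing_outputs is a hypothetical helper that lists array
# indices whose output file is missing or empty.
# The ( ) function body runs in a subshell, so its variables stay local.
missing_outputs() (
    prefix="$1"   # output path up to the array index, e.g. .../swissprot/mRNA_sprot
    first="$2"    # first array index, e.g. 1
    last="$3"     # last array index, e.g. 2000

    i="$first"
    while [ "$i" -le "$last" ]; do
        # -s is true only when the file exists and is non-empty
        [ -s "$prefix.$i.xml" ] || echo "$i"
        i=$((i + 1))
    done
)
```

For example, missing_outputs /lustre/haven/proj/UTK0032/projects/undergrads/annotations/blast/swissprot/mRNA_sprot 1 2000 would print nothing if all 2000 Swiss-Prot results are present. A non-empty file does not prove the task finished cleanly, so this is only a first-pass check.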