statonlab / hardwoods_site

Hardwoods Genomics bugs, data loading, and general issues
GNU General Public License v3.0

Juglans Nigra #430

Open RaymondS1 opened 6 years ago

RaymondS1 commented 6 years ago

https://treegenesdb.org/FTP/Genomes/Juni/ Nigra

RaymondS1 commented 6 years ago

Path to CDS FASTA: /var/www/html/sites/default/files/sequences/juglans_nigra/juglans_nigra_cds.fasta
Path to protein FASTA: /var/www/html/sites/default/files/sequences/juglans_nigra/juglans_nigra_prot.fasta

almasaeed2010 commented 5 years ago

There is already a transcriptome loaded for this organism. I am not sure what the process should be to replace that with a genome. I am aware that in the past we simply deleted old features and loaded new ones. But now with ES, we can keep old features and relate them to new features so they stay searchable.

Let's discuss during our next meeting.

For the time being @RaymondS1 move on to a different organism please.

CaseyRichards92 commented 5 years ago

@almasaeed2010 Have we figured out a way ahead with this?

almasaeed2010 commented 5 years ago

I think we can simply create a new genome assembly then load this genome. Let's try that.

CaseyRichards92 commented 5 years ago

I'll start on it. Sorry @RaymondS1, I'm stealing your old forgotten genome.

CaseyRichards92 commented 5 years ago

@almasaeed2010 There is already a genome for this: https://www.hardwoodgenomics.org/organism/Juglans/hindsii?tripal_pane=group_summary_tripalpane
Am I doing a transcriptome or a genome?

almasaeed2010 commented 5 years ago

That's the wrong organism, @cricha59. This is the one you want: https://www.hardwoodgenomics.org/organism/Juglans/nigra?tripal_pane=group_transcriptome

CaseyRichards92 commented 5 years ago

Blast

#PBS -S /bin/bash
#PBS -j oe
#PBS -A ACF-UTK0011
#PBS -t 1-200
#PBS -l nodes=1:ppn=2
#PBS -l walltime=04:00:00

cd $PBS_O_WORKDIR

module load blast

blastx \
 -query /lustre/haven/gamma/staton/projects/undergrads/juglans_nigra/cds_splits/juni.1_0.cds.fa.$PBS_ARRAYID \
 -db /lustre/haven/gamma/staton/library/uniprot/uniprot_sprot.fasta \
 -out /lustre/haven/gamma/staton/projects/undergrads/juglans_nigra/blast/swissprot/nigra_swissprot.$PBS_ARRAYID.xml \
 -evalue 1e-5 \
 -outfmt 5
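
The array job above reads pre-split query files named juni.1_0.cds.fa.1 through juni.1_0.cds.fa.200 (one per $PBS_ARRAYID), but the thread does not show how those pieces were produced. The following is a minimal sketch of one way to create numbered pieces like that; `split_fasta` is a hypothetical helper, not the command that was actually used, and the round-robin assignment of records is an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): distribute FASTA records
# round-robin into N files named <out_prefix>.1 ... <out_prefix>.N,
# matching the numeric suffix that $PBS_ARRAYID selects in the job script.
#
# usage: split_fasta <input.fasta> <number_of_pieces> <out_prefix>
# Output files are appended to, so remove stale pieces before re-running.
split_fasta() {
    awk -v n="$2" -v prefix="$3" '
        /^>/ {
            # A header line starts a new record: switch to the next piece.
            if (file != "") close(file)
            file = prefix "." ((count % n) + 1)
            count++
        }
        # Lines before the first header (if any) are skipped.
        file != "" { print >> file }
    ' "$1"
}

# Example (paths are placeholders):
#   split_fasta juni.1_0.cds.fa 200 cds_splits/juni.1_0.cds.fa
```

Closing each piece before switching to the next keeps the number of simultaneously open files at one, which avoids open-file limits in some awk implementations when the piece count is in the hundreds.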

CaseyRichards92 commented 5 years ago

Trembl

#PBS -S /bin/bash
#PBS -j oe
#PBS -A ACF-UTK0011
#PBS -t 1-200
#PBS -l nodes=1:ppn=2
#PBS -l walltime=15:00:00

cd $PBS_O_WORKDIR

module load blast

blastx \
 -query /lustre/haven/gamma/staton/projects/undergrads/juglans_nigra/cds_splits/juni.1_0.cds.fa.$PBS_ARRAYID \
 -db /lustre/haven/gamma/staton/library/uniprot/uniprot_trembl_plants_July_2018.fasta \
 -out /lustre/haven/gamma/staton/projects/undergrads/juglans_nigra/blast/trembl/nigra_trembl.$PBS_ARRAYID.xml \
 -evalue 1e-5 \
 -outfmt 5

CaseyRichards92 commented 5 years ago

IPS

#PBS -A ACF-UTK0011
#PBS -S /bin/bash
#PBS -t 1-200
#PBS -j oe
#PBS -l nodes=1:ppn=4
#PBS -l walltime=3:30:00

cd $PBS_O_WORKDIR

module load python3

/lustre/haven/gamma/staton/software/interproscan-5.34-73.0/interproscan.sh \
 -i /lustre/haven/gamma/staton/projects/undergrads/juglans_nigra/peptide_splits/juni.1_0.peptides.fa.$PBS_ARRAYID \
 -f XML \
 -d /lustre/haven/gamma/staton/projects/undergrads/juglans_nigra/blast/IPS/xmls \
 --disable-precalc \
 --iprlookup \
 --goterms \
 --pathways \
 --tempdir /lustre/haven/gamma/staton/projects/undergrads/juglans_nigra/blast/IPS/tmp \
 > /lustre/haven/gamma/staton/projects/undergrads/juglans_nigra/blast/IPS/tmp/$PBS_ARRAYID.out

CaseyRichards92 commented 5 years ago

Added to 6 juglans species reference genome https://www.hardwoodgenomics.org/Genome-assembly/2209433

CaseyRichards92 commented 5 years ago

CDS FASTA Loader https://www.hardwoodgenomics.org/admin/tripal/tripal_jobs/view/811650

CaseyRichards92 commented 5 years ago

Peptide FASTA Loader https://www.hardwoodgenomics.org/admin/tripal/tripal_jobs/view/811654

CaseyRichards92 commented 5 years ago

Published mRNA-Polypeptide https://www.hardwoodgenomics.org/admin/tripal/tripal_jobs/view/811655

CaseyRichards92 commented 5 years ago

BLAST Annotations
swissprot: https://www.hardwoodgenomics.org/BLAST-annotation/3553032
trembl: https://www.hardwoodgenomics.org/BLAST-annotation/3553033

CaseyRichards92 commented 5 years ago

IPS Annotations https://www.hardwoodgenomics.org/InterProScan-annotation/3553034

CaseyRichards92 commented 5 years ago

Swissprot Loader https://www.hardwoodgenomics.org/admin/tripal/tripal_jobs/view/819584

CaseyRichards92 commented 5 years ago

IPS Loader https://www.hardwoodgenomics.org/admin/tripal/tripal_jobs/view/819583

CaseyRichards92 commented 5 years ago

Combining the Juglans nigra genome organism with the Juglans nigra organism and specifying which features are from the genome and which are from the transcriptome.

CaseyRichards92 commented 5 years ago

New CDS loader: https://www.hardwoodgenomics.org/admin/tripal/tripal_jobs/view/819574
New Peptide Loader: https://www.hardwoodgenomics.org/admin/tripal/tripal_jobs/view/819578
New mrna-polypeptide published records: https://www.hardwoodgenomics.org/admin/tripal/tripal_jobs/view/819579

CaseyRichards92 commented 5 years ago

KEGG Annotation https://www.hardwoodgenomics.org/KEGGresults/3613525

CaseyRichards92 commented 5 years ago

Trembl: split the CDS into 400 pieces and re-ran as two array jobs.

1-200

#PBS -S /bin/bash
#PBS -j oe
#PBS -A ACF-UTK0011
#PBS -t 1-200
#PBS -l nodes=1:ppn=2
#PBS -l walltime=12:00:00

cd $PBS_O_WORKDIR

module load blast

blastx \
 -query /lustre/haven/gamma/staton/projects/undergrads/juglans_nigra/trembl_splits/juni.1_0.cds.fa.$PBS_ARRAYID \
 -db /lustre/haven/gamma/staton/library/uniprot/uniprot_trembl_plants_July_2018.fasta \
 -out /lustre/haven/gamma/staton/projects/undergrads/juglans_nigra/blast/trembl/nigra_trembl.$PBS_ARRAYID.xml \
 -evalue 1e-5 \
 -outfmt 5

201-400


#PBS -S /bin/bash
#PBS -j oe
#PBS -A ACF-UTK0011
#PBS -t 201-400
#PBS -l nodes=1:ppn=2
#PBS -l walltime=12:00:00

cd $PBS_O_WORKDIR

module load blast

blastx \
 -query /lustre/haven/gamma/staton/projects/undergrads/juglans_nigra/trembl_splits/juni.1_0.cds.fa.$PBS_ARRAYID \
 -db /lustre/haven/gamma/staton/library/uniprot/uniprot_trembl_plants_July_2018.fasta \
 -out /lustre/haven/gamma/staton/projects/undergrads/juglans_nigra/blast/trembl/nigra_trembl.$PBS_ARRAYID.xml \
 -evalue 1e-5 \
 -outfmt 5
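
With 400 array tasks and a 12-hour walltime, some tasks may be killed before blastx finishes, so it can help to confirm that every output file exists and is complete before loading. The sketch below is an assumption about how that could be checked, not something done in the thread: `incomplete_ids` is a hypothetical helper, and it treats a BLAST XML file (-outfmt 5) as complete only if its last lines contain the closing `</BlastOutput>` tag.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the array IDs whose
# BLAST XML output is missing, empty, or lacks the closing root tag.
#
# usage: incomplete_ids <path_prefix> <first_id> <last_id> <suffix>
incomplete_ids() {
    i=$2
    while [ "$i" -le "$3" ]; do
        f="$1$i$4"
        if [ ! -s "$f" ]; then
            # Missing or zero-length output.
            echo "$i"
        elif ! tail -n 5 "$f" | grep -q '</BlastOutput>'; then
            # Output exists but looks truncated.
            echo "$i"
        fi
        i=$((i + 1))
    done
}

# Example (path is a placeholder for the blast/trembl output directory):
#   incomplete_ids blast/trembl/nigra_trembl. 1 400 .xml
```

The printed IDs could then be resubmitted on their own instead of re-running the whole array.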

CaseyRichards92 commented 5 years ago

KEGG Loader https://www.hardwoodgenomics.org/admin/tripal/tripal_jobs/view/819586

CaseyRichards92 commented 5 years ago

Trembl Loader https://www.hardwoodgenomics.org/admin/tripal/tripal_jobs/view/819585

CaseyRichards92 commented 5 years ago

BLAST DB CDS: https://www.hardwoodgenomics.org/content/juglans-nigra-genome-transcripts
BLAST DB PEPTIDES: https://www.hardwoodgenomics.org/content/juglans-nigra-genome-peptides
BLAST DB Scaffolds: https://www.hardwoodgenomics.org/content/juglans-nigra-genome-scaffolds

CaseyRichards92 commented 5 years ago

Unable to obtain a GFF for JBrowse

CaseyRichards92 commented 5 years ago

KEGG Loader has failed:
SQLSTATE[25P02]: In failed sql transaction: 7 ERROR: current transaction is aborted, commands ignored until end of transaction block

almasaeed2010 commented 5 years ago

Error fixed and job loader is running normally.

CaseyRichards92 commented 5 years ago

Convert .gtf to .gff for JBrowse. Also make downloadable.

almasaeed2010 commented 5 years ago

Docs for GTF conversion: http://jbrowse.org/docs/faq_data_loading.html#how-do-i-convert-gtf-to-gff

CaseyRichards92 commented 5 years ago

Received this error when running the command:
/staton/software/cufflinks-2.2.1.Linux_x86_64/gffread -G -E juni.1_0.gtf -o- > juni.1_0.gff3

Warning: could not parse ID or Parent from GFF line:
scaffold194     GFACS   start_codon     428684  428686  .       -       .       .
Warning: could not parse ID or Parent from GFF line:
scaffold194     GFACS   stop_codon      425734  425736  .       -       .       .
Warning: could not parse ID or Parent from GFF line:
scaffold1390    GFACS   gene    127322  158903  0.81    -       .       Juni_21554.t1
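
In the three lines quoted by the warnings, the last (ninth) column is either `.` or a bare identifier such as Juni_21554.t1, rather than GTF-style `gene_id "..."; transcript_id "...";` pairs or GFF3-style `ID=`/`Parent=` tags. That would be consistent with gffread not finding an ID or Parent on those lines. A rough way to list every such line in the file is sketched below; `bad_attribute_lines` is a hypothetical helper, and it assumes the file is tab-delimited.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print "<line number>: <line>"
# for every feature line whose attribute column (column 9) has neither
# GTF-style gene_id/transcript_id pairs nor GFF3-style ID=/Parent= tags.
#
# usage: bad_attribute_lines <file.gtf>
bad_attribute_lines() {
    awk -F '\t' '
        # Skip comment lines and empty lines.
        /^#/ || NF == 0 { next }
        # Report lines with too few columns or an unparseable column 9.
        NF < 9 || $9 !~ /(gene_id|transcript_id) "|(ID|Parent)=/ {
            print FNR ": " $0
        }
    ' "$1"
}

# Example:
#   bad_attribute_lines juni.1_0.gtf | head
```

If the only lines reported are start_codon, stop_codon, and gene features, the transcript and exon records may still have converted correctly, but that would need to be confirmed against the juni.1_0.gff3 output.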
