Open mestato opened 6 years ago
http://160.36.205.61:9090/bio_data/418 Link to organism
http://160.36.205.61:9090/bio_data/419 Analysis of organism
http://160.36.205.61:9090/admin/tripal/tripal_jobs/view/24344
Regular expression for proteins:
>(FS[0-9a-zA-Z]*)
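A pattern like this can be checked from the shell against a few header lines before a loader job is submitted. This is a minimal sketch, assuming bash. The `extract_id` function name is hypothetical, and bash's `=~` uses POSIX extended regular expressions, which may differ in detail from the regex engine the FASTA loader uses.

```shell
#!/bin/bash
# Hypothetical helper (not part of Tripal): apply the protein regex to one
# FASTA header line and print capture group 1, or fail if nothing matches.
extract_id() {
    local header=$1
    # Pattern copied from the comment above; kept in a variable so that
    # bash treats it as a regex rather than a quoted literal.
    local re='>(FS[0-9a-zA-Z]*)'
    if [[ $header =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1
    fi
}

extract_id '>FSB011771501 kinase chloroplastic-like|protein serine/threonine kinase activity'
```

For the sample header the function prints `FSB011771501`, so the description text after the ID is not captured.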
http://160.36.205.61:9090/admin/tripal/tripal_jobs/view/24357
http://160.36.205.61:9090/admin/tripal/tripal_jobs/view/24384 Link to publishing job
Path to CDS:
/home/www/sites/default/files/sequences/european_beech/Fagus_sylvatica_cds_v1.3.fasta
Path to Proteins:
/home/www/sites/default/files/sequences/european_beech/Fagus_sylvatica_prot_v1.3.fasta
@RaymondS1 you are ready to publish your genes.
https://hardwoods.ag.utk.edu/admin/tripal/tripal_jobs/view/677307 Link to protein job
https://hardwoods.ag.utk.edu/admin/tripal/tripal_jobs/view/677336 published gene records
Swiss-Prot BLAST for European beech
#PBS -N swissprot_blast
#PBS -S /bin/bash
#PBS -j oe
#PBS -A ACF-UTK0011
#PBS -t 1-200
#PBS -l nodes=1:ppn=2
#PBS -l walltime=02:00:00
cd $PBS_O_WORKDIR
module load blast
blastx \
-query /lustre/haven/gamma/staton/projects/undergrads/european_beech/splits/Fagus_sylvatica_cds_v1.3.fasta.$PBS_ARRAYID \
-db /lustre/haven/gamma/staton/library/uniprot/uniprot_sprot.fasta \
-out /lustre/haven/gamma/staton/projects/undergrads/european_beech/blast/swissprot/european_beech_sprot_$PBS_ARRAYID.xml \
-evalue 1e-5 \
-outfmt 5
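The job array reads one input chunk per task (`...fasta.$PBS_ARRAYID`, for 1 through 200). The thread does not show how those chunks were produced, so the following is only one possible sketch. It assumes a well-formed FASTA file and an `awk` that can keep that many output files open at once (for example gawk). The `split_fasta` function name and the example paths are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: distribute the records of a FASTA file round-robin
# across N chunk files named <prefix>.1 ... <prefix>.N.
split_fasta() {
    local fasta=$1 chunks=$2 prefix=$3
    awk -v n="$chunks" -v prefix="$prefix" '
        # A header line starts a new record: pick the next chunk file.
        /^>/ { out = prefix "." ((count++ % n) + 1) }
        # Write every line (header or sequence) to the current chunk.
        { print > out }
    ' "$fasta"
}

# Example call (placeholder paths, not the ones used in this issue):
# split_fasta Fagus_sylvatica_cds_v1.3.fasta 200 splits/Fagus_sylvatica_cds_v1.3.fasta
```

Round-robin assignment keeps the chunks close in record count, which helps when every array task shares the same walltime limit. If the file has fewer records than chunks, some chunk files are never created.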
TrEMBL BLAST
#PBS -N trembl_blast
#PBS -S /bin/bash
#PBS -j oe
#PBS -A ACF-UTK0011
#PBS -t 1-200
#PBS -l nodes=1:ppn=2
#PBS -l walltime=04:00:00
cd $PBS_O_WORKDIR
module load blast
blastx \
-query /lustre/haven/gamma/staton/projects/undergrads/european_beech/splits/Fagus_sylvatica_cds_v1.3.fasta.$PBS_ARRAYID \
-db /lustre/haven/gamma/staton/library/uniprot/uniprot_trembl_plants_July_2018.fasta \
-out /lustre/haven/gamma/staton/projects/undergrads/european_beech/blast/trembl/european_beech_trembl_$PBS_ARRAYID.xml \
-evalue 1e-5 \
-outfmt 5
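Array tasks can hit the walltime limit and leave a truncated XML file behind, so it can be worth confirming that every task finished before the results are uploaded. This is a minimal sketch, assuming bash and relying on BLAST XML output (`-outfmt 5`) ending with a closing `</BlastOutput>` tag. The `check_blast_xml` function name is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: report array indices whose BLAST XML output is missing,
# empty, or lacks the closing </BlastOutput> tag near the end of the file.
check_blast_xml() {
    local prefix=$1 last=$2 i file status=0
    for ((i = 1; i <= last; i++)); do
        file="${prefix}${i}.xml"
        if [[ ! -s $file ]]; then
            echo "missing or empty: $file"
            status=1
        elif ! tail -n 5 "$file" | grep -q '</BlastOutput>'; then
            echo "incomplete: $file"
            status=1
        fi
    done
    # Non-zero exit status if any output needs to be rerun.
    return $status
}

# Example call (prefix taken from the -out pattern in the script above):
# check_blast_xml blast/trembl/european_beech_trembl_ 200
```

Any index the function reports can be resubmitted on its own, for example with `#PBS -t` set to just that index.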
Next Steps:
https://hardwoods.ag.utk.edu/admin/tripal/tripal_jobs/view/710317 New protein fasta. Correction
>FSB010000101
ATGGCACCAACTATGTATATTGTCTATCTTCGCTTCAATGGGGAGATTATTTATGGTCAACATGGAGCTG
AGTATCAAGGGTCGCAAATGAAGTTCATCCGGGTTCATCGTGGGATTAGTTTTGTCGAATTGGAAACGAA
GATATTCAATGCACTACAATTGGACAATCAATCTCATCGTATAACAGTTACATACCGTTGTCCTCAGGAG
GTGATTTCACCTCACATTAATTACATGACTCTATTGATAACAGACGACGACGGTGTTAATCTCATGTTTG
ACATGTTAGATGCAACGCCTGAATTAAAAGGTATTGAGTTATATATAAGTGTGGAGGATTGTGTTGGTGA
AGGTGTTGAGCCTCTTACACAAGATGATGGGGATGGATTAGTAGCGGAAGATTGTGTTGGTGAAGATGTA
CAACAAATGACTGTGCATGATACTGCTCCTTCGACACAACCCTCTACACTTGGAAGGTGTACACCACAAT
TACATGAGATACGAACATCGGTGGAGGATTGTGGTCCCAGCACTCGACATGAGTATGTTCCATACGAGGT
AAACCCTTTAGCTGGAGTGCATGATACGATGATGTTGGAATGTACTGCTGATGATGAAGAAGAAAACGCT
>FSB011771501 kinase chloroplastic-like|protein serine/threonine kinase activity;ATP binding;protein phosphorylation;serine family amino acid metabolic process
mgncldssakvdtaqsshatsgsgiskfssktsrssapssltiqtfseksnasslpnprsegeilsspnlksfsfnelkn
atrnfrpdsllgeggfgyvfkgwidehsfsaakpgsgmvvavkklksegfqghkewltevnylgqlhhpnlvkligycle
genrllvyefmpkgslenhlfrrgpqplswairikvatgaarglcflhdaksqviyrdfkasnilldaefnaklsdfgla
kagptgdrthvstqvmgthgyaapeyvatgrltaksdvysfgvvllellsgrravdktkvvieqnlvdwakpylgdkrkl
frimdtklegqypqkgaytaatlalqclsneakgrprmaevlatleqldnpknagrpsqseqqtvapvrkspmrphhspr
nltpgasplpayrqsprvr
>FSB011771601 probable serine threonine- kinase NAK|protein serine/threonine kinase activity;ATP binding;protein phosphorylation;serine family amino acid metabolic process
mkvnkkdellhayrldcfyysvlkaatkkfscknllgeggfgdvykgyisyctmtaarpgcgfavavkrqrktgeqgvhe
wlneltflaglnhpnvvkligycsegdqrilvykymiggsleahllkadvtelnwrrrinialgaarglyflhtrgrpvi
@almasaeed2010 what regex should I use?
@RaymondS1 try this regular expression:
>(.*?)\w+
https://hardwoods.ag.utk.edu/BLAST-annotation/2360512 Blast Annotation
IPS
#PBS -N european_beech_ips
#PBS -A ACF-UTK0011
#PBS -S /bin/bash
#PBS -t 1-200
#PBS -j oe
#PBS -l nodes=1:ppn=4
#PBS -l walltime=3:30:00
cd $PBS_O_WORKDIR
/lustre/haven/gamma/staton/software/interproscan-5.28-67.0/interproscan.sh \
-i /lustre/haven/gamma/staton/projects/undergrads/european_beech/raw_data/ipssplits/Fagus_sylvatica_prot_v1.3.fasta.$PBS_ARRAYID \
-f XML \
-d /lustre/haven/gamma/staton/projects/undergrads/european_beech/ips/xml \
--disable-precalc \
--iprlookup \
--goterms \
--pathways \
--tempdir /lustre/haven/gamma/staton/projects/undergrads/european_beech/ips/tmp \
>& /lustre/haven/gamma/staton/projects/undergrads/european_beech/ips/tmp/$PBS_ARRAYID.out
https://hardwoods.ag.utk.edu/InterProScan-annotation/2418507 Link to InterPro Scan Annotation
https://hardwoods.ag.utk.edu/admin/tripal/tripal_jobs/view/718904 InterPro Scan file upload
@almasaeed2010 Delete the 'F. Sylvatica' analysis.
Task List
Any updates on this organism? I think it only needs a couple of things before it can go live:
Reference Genome
description
Live Site
https://hardwoodgenomics.org/admin/tripal/tripal_jobs/view/480523 mRNA publishing
https://hardwoodgenomics.org/admin/tripal/tripal_jobs/view/486382 CDS Fasta Upload
https://hardwoodgenomics.org/admin/tripal/tripal_jobs/view/480517 Protein Fasta
https://hardwoodgenomics.org/admin/tripal/tripal_jobs/view/480506 Swissprot upload job link
https://hardwoodgenomics.org/admin/tripal/tripal_jobs/view/486528 Trembl Upload
https://hardwoodgenomics.org/admin/tripal/tripal_jobs/view/487173 Interproscan Upload
@RaymondS1 I see jobs on the live site for this organism that are not posted here. Please post the links to every submitted job here so we can inspect errors easily.
Currently this organism is missing the following:
Let's try to get these done. If you have any questions or need help with any of this please let me know.
@RaymondS1 so far so good! See above for updated list of completed tasks.
You are now only missing the following:
@RaymondS1 Everything looks good for this organism except for download links to the mRNA and polypeptide files, along with the GFF file if one exists. If you don't know how to do this, please let me know.
Three items:
Genome and gff are available for Jbrowse
@RaymondS1
Add a cross reference to the organism page and you are good to close.
@RaymondS1 I added the cross reference and this issue can now be closed https://www.hardwoodgenomics.org/organism/Fagus/sylvatica
Reference genome for Fagus sylvatica: http://thines-lab.senckenberg.de/beechgenome/index2.htm
Reference manuscript: https://academic.oup.com/gigascience/article/7/6/giy063/5017772