IBP-21-22 / intbioproj


NLP 3' UTR #18

Open LoreVanSantvliet opened 2 years ago

LoreVanSantvliet commented 2 years ago
VasLem commented 2 years ago

I found and downloaded the following resource: http://sgd-archive.yeastgenome.org/sequence/S288C_reference/SGD_all_ORFs_3prime_UTRs.fsa.zip

with the following description:

SGD_all_ORFs_3prime_UTRs.README

Information about the SGD_all_ORFs_3prime_UTRs.fsa file.

This file contains sequences for predicted 3' UTR sequences for all characterized, uncharacterized and dubious ORFs in S288c.

Coordinates from both the longest transcripts covering an ORF in yeast grown in YPD or GAL (S288C transcriptome set derived from Pelechano et al, PMID:23615609) and all ORFs were used to determine the 3' UTR coordinates.

A BED file with these coordinates was made. Sequences were batch downloaded in FASTA format from the UCSC Genome Table Browser on 03-09-2021 (https://genome.ucsc.edu/cgi-bin/hgTables?hgsid=604915183_aUc9FZ1vwyAccGti1uI0wteYEM7x)

file created: 2017-08-25
last modified: 2021-03-19
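
For anyone who wants to load these sequences, here is a minimal sketch of reading the downloaded archive with the Python standard library only. It assumes the zip contains a single plain-text FASTA member ending in `.fsa` and that the text after `>` on each header line is unique per record; neither has been checked against the actual file. The function names `read_fasta` and `load_utrs_from_zip` are hypothetical helpers, not part of this repository.

```python
import io
import zipfile
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; collect them for joining.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def load_utrs_from_zip(zip_path: str) -> Dict[str, str]:
    """Read the first .fsa member of the archive into {header: sequence}.

    Assumption: the archive holds one FASTA file with a .fsa extension.
    Records with identical headers would overwrite each other.
    """
    with zipfile.ZipFile(zip_path) as archive:
        member = next(n for n in archive.namelist() if n.endswith(".fsa"))
        with archive.open(member) as handle:
            text = io.TextIOWrapper(handle, encoding="utf-8")
            return dict(read_fasta(text))
```

Calling `load_utrs_from_zip("SGD_all_ORFs_3prime_UTRs.fsa.zip")` on the downloaded file should then give a dictionary keyed by the full header line. How the ORF name is encoded in that header would need to be checked in the file itself before splitting it further.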
LoreVanSantvliet commented 2 years ago

To do: