Open teresa-m opened 3 years ago
Ideas from Rolf. Not sure if I summarized them correctly?
https://dorina.mdc-berlin.de/regulators -> get CLIP data from here or from Dominik
Data from the paper:
- RBP coverage: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE38355
- mRNA-Seq: https://www.ncbi.nlm.nih.gov/gds?LinkName=biosample_gds&from_uid=997837 ... maybe total RNA-Seq would be better, to also get ncRNAs
Downloaded data (full mRNA-Seq):

Run | Assay Type | AvgSpotLen | Bases | BioProject | BioSample | Bytes | Center Name | Consent | DATASTORE filetype | DATASTORE provider | DATASTORE region | Experiment | GEO_Accession (exp) | Instrument | Library Name | LibraryLayout | LibrarySelection | LibrarySource | Organism | Platform | purification | ReleaseDate | Sample Name | source_name | SRA Study | Treatment |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SRR500121 | RNA-Seq | 36 | 1368141804 | PRJNA167851 | SAMN00997837 | 847624134 | GEO | public | "sra fastq" | "gs s3 ncbi" | "gs.US s3.us-east-1 ncbi.public" | SRX149162 | GSM936076 | Illumina Genome Analyzer II | GSM936076: RNAseq_mRNA | SINGLE | cDNA | TRANSCRIPTOMIC | Homo sapiens | ILLUMINA | oligo(dT) | 2012-06-06T00:00:00Z | GSM936076 | HEK293 cell culture | SRP013463 | no treatment (mRNA) |
Next steps will be to generate the following position files:
Generate potential negative mRNA binding sites by filtering out all positions of 2 and 3 in one
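The filtering step could be sketched roughly as an interval subtraction. This is only an illustration: the numbered position files are not spelled out in this issue, so `candidates`, `set2` and `set3` are hypothetical stand-ins for them, and the sketch assumes half-open, BED-style `(start, end)` intervals that all lie on the same transcript and strand.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), BED-style (assumption)


def merge(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def subtract(candidates: List[Interval], excluded: List[Interval]) -> List[Interval]:
    """Return the parts of `candidates` not covered by any `excluded` interval."""
    blocked = merge(excluded)
    result: List[Interval] = []
    for start, end in merge(candidates):
        cursor = start
        for b_start, b_end in blocked:
            if b_end <= cursor:
                continue  # blocked region lies entirely before the cursor
            if b_start >= end:
                break  # blocked region lies entirely after this candidate
            if b_start > cursor:
                result.append((cursor, b_start))  # keep the uncovered gap
            cursor = max(cursor, b_end)
            if cursor >= end:
                break
        if cursor < end:
            result.append((cursor, end))
    return result


# Hypothetical example: one candidate region and two sets of positions to remove.
candidates = [(0, 100)]
set2 = [(10, 20)]
set3 = [(15, 30), (90, 120)]
negatives = subtract(candidates, set2 + set3)
print(negatives)  # [(0, 10), (30, 90)]
```

For real BED files with several chromosomes and strands, an existing tool such as `bedtools subtract` would probably be the more robust choice; the sketch only shows the logic.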
Next step: How to generate the negative RRIs?
The tasks for constructing the data set are described here: Generate trainings data using context #25
Idea is to:
The binding profile for human can be found here: https://doi.org/10.1016/j.molcel.2012.05.021