Develop the deep learning model for cell free DNA analysis

cuhk-haosun / course-MBI6013

Material for MSc research project MBI6013

GNU General Public License v3.0


Develop the deep learning model for cell free DNA analysis #7

Open 223050025 opened 7 months ago

223050025 commented 7 months ago

Talked with Dr. Sun to confirm the research topic and read the paper "DNA methylation analysis explores the molecular basis of plasma cell-free DNA fragmentation. Nature Communications, 14(287), https://doi.org/10.1038/s41467-023-35959-6".
Plan: run the code on a public dataset first. The final goal is to use a deep learning model to classify and analyse cell-free DNA data, and to build a Docker image.

223050025 commented 7 months ago

Public cfDNA whole genome sequencing datasets: GSE71378, GSE124686, GSE81314
WGBS dataset: CRA001537

Milokita commented 7 months ago

> Public cfDNA whole genome sequencing datasets: GSE71378, GSE124686, GSE81314
>
> WGBS dataset: CRA001537

Please specify the path where your data is saved.

223050025 commented 6 months ago

Tried using the SRA Toolkit to download FASTQ files. For example, a raw run ID for GSE71378 is SRR061633; the command "fastq-dump --split-files SRR061633" splits the paired-end reads into two files, with the first and second read of each pair stored separately.

The data is saved at /share/home/grp-sunhao/liyixiao/SRR061633_1.fastq
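For reference, the download step could be wrapped in a small Python helper so the save path is explicit. This is only a sketch, not the script actually used here: the function names and the `out_dir` argument are made up for illustration, and it assumes `fastq-dump` from the SRA Toolkit is on the PATH and that the run is paired-end (so `--split-files` writes `_1.fastq` and `_2.fastq`).

```python
import shlex
import subprocess
from pathlib import Path


def build_fastq_dump_command(run_id: str, out_dir: str = ".") -> list[str]:
    """Build the fastq-dump argument list for one run (hypothetical helper)."""
    # --split-files writes each mate of a pair to its own file;
    # --outdir sets the directory the FASTQ files are written to.
    return ["fastq-dump", "--split-files", "--outdir", out_dir, run_id]


def expected_paired_files(run_id: str, out_dir: str = ".") -> list[Path]:
    """Paths that --split-files is expected to produce for a paired-end run."""
    return [Path(out_dir) / f"{run_id}_{mate}.fastq" for mate in (1, 2)]


def download_run(run_id: str, out_dir: str = ".") -> list[Path]:
    """Run fastq-dump and return the expected output files that now exist."""
    cmd = build_fastq_dump_command(run_id, out_dir)
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return [p for p in expected_paired_files(run_id, out_dir) if p.exists()]


if __name__ == "__main__":
    # Only print the command here; call download_run() to actually fetch data.
    print(shlex.join(build_fastq_dump_command("SRR061633")))
```

Calling `download_run("SRR061633", out_dir=...)` with the group directory as `out_dir` would fetch the run and report which of the two mate files were written.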

223050025 commented 6 months ago

@Milokita (senior labmate), could you tell me how to download data from the EGA database?

Milokita commented 6 months ago

In short, if it's a controlled-access dataset, you need to submit an access application; otherwise you can simply download it.