morris-lab / CellOracle

This is the alpha version of the CellOracle package

Can we annotate TSS using scATAC-seq datasets? #151

Open yangjun9095 opened 1 year ago

yangjun9095 commented 1 year ago

Dear Kenji,

Thank you for sharing and maintaining this great tool, CellOracle, with great documentation and tutorials!

I'm trying to build GRNs for early zebrafish embryos using our own single-cell multiome datasets (RNA + ATAC), with the goal of identifying key regulatory TFs in early cell fate specification. I followed the CellOracle tutorials and obtained a GRN whose in silico KO predictions matched our expectations for some genes. Although this GRN is already very useful, I realized that my base GRN is missing some key regulatory genes among the "target" rows because their transcription start sites (TSSs) were not annotated. For example, I'm missing "sox2", "tbxta", and "meox1", whose upstream TFs I'm interested in.
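A minimal sketch of how such missing target genes could be detected, assuming the TSS-annotated peaks are available as simple records with a `gene_short_name` field. The records, the field names, and the helper `summarize_annotation` are hypothetical stand-ins for illustration and are not CellOracle's API:

```python
# Hypothetical stand-in for a TSS-annotated peak table: one record per
# annotated peak. The field names are assumptions made for this sketch.
tss_annotated = [
    {"peak_id": "chr1_1000_1500", "gene_short_name": "gata1a"},
    {"peak_id": "chr1_9000_9400", "gene_short_name": "gata1a"},
    {"peak_id": "chr2_300_800", "gene_short_name": "myod1"},
]

# Genes whose upstream TFs are of interest.
genes_of_interest = ["sox2", "tbxta", "meox1", "myod1"]


def summarize_annotation(rows, query_genes):
    """Count annotations and unique genes, and list query genes with no TSS peak."""
    annotated = {row["gene_short_name"] for row in rows}
    missing = [gene for gene in query_genes if gene not in annotated]
    return {
        "n_annotations": len(rows),   # one row per annotated peak
        "n_genes": len(annotated),    # unique genes with at least one TSS peak
        "missing": missing,           # query genes that cannot appear as targets
    }


summary = summarize_annotation(tss_annotated, genes_of_interest)
print(summary)
```

The same two counts (annotation rows versus unique genes) correspond to the kind of numbers reported in the environment description in this post.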

I think this might be beyond CellOracle's scope, but I wanted to get your thoughts. Since the scATAC-seq data gives us peaks near those genes, could we pick one of those peaks and define it as the TSS/promoter? Alternatively, do you have any suggestions for improving the TSS annotation beyond the Ensembl/UCSC annotation? For example, you mentioned in this issue (https://github.com/morris-lab/CellOracle/issues/88) that "Celloracle is using genomepy to convert a bed file to a fasta file, and the TSS annotation usually uses information obtained from Ensembl or homer database." We would appreciate any guidance. Thank you! Below is a quick description of my CellOracle environment.

CellOracle version: 0.14.0
Reference genome for TSS annotation: "danRer11" (UCSC)
Number of TSS annotations: 17800
Number of genes with TSS annotated: 15299
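To make the idea of picking a peak and defining it as the TSS/promoter more concrete, here is a minimal sketch under stated assumptions. It assumes peaks are named `chr_start_end` and that a manually curated TSS position is available for each missing gene. The helper names, the 500 bp window, the output field names, and all coordinates are hypothetical placeholders, not real danRer11 positions and not part of CellOracle:

```python
def parse_peak(peak_id):
    """Split a 'chr_start_end' peak name into (chrom, start, end)."""
    chrom, start, end = peak_id.rsplit("_", 2)
    return chrom, int(start), int(end)


def assign_peaks_to_tss(peak_ids, curated_tss, window=500):
    """Return one record per peak that overlaps a +/- window around a curated TSS."""
    rows = []
    for gene, (tss_chrom, tss_pos, strand) in curated_tss.items():
        lo, hi = tss_pos - window, tss_pos + window
        for peak_id in peak_ids:
            chrom, start, end = parse_peak(peak_id)
            # Interval overlap test between the peak and the TSS window.
            if chrom == tss_chrom and start <= hi and end >= lo:
                rows.append({
                    "chr": chrom,
                    "start": start,
                    "end": end,
                    "gene_short_name": gene,
                    "strand": strand,
                })
    return rows


# Placeholder peaks and TSS positions for illustration only (NOT real coordinates).
peaks = ["chr1_1000_1500", "chr1_5000_5600", "chr2_200_700"]
curated_tss = {
    "sox2": ("chr1", 1600, "+"),
    "tbxta": ("chr2", 5000, "-"),
}

extra_rows = assign_peaks_to_tss(peaks, curated_tss)
for row in extra_rows:
    print(row)
```

Records built this way could in principle be appended to the existing TSS-annotated peak table before constructing the base GRN. Whether that is a valid thing to do in CellOracle is exactly the question being asked here.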

Best, Yang-Joon

yangjun9095 commented 10 months ago

Dear @KenjiKamimoto-wustl122,

Hi! Just wanted to ping you again on this one! Thank you for sharing this great tool with the community!

Best, Yang-Joon