morris-lab / Capybara

Capybara: A computational tool to measure cell identity and fate transitions

ARCHS4 mining #10

Closed Danyi-ZHENG closed 2 years ago

Danyi-ZHENG commented 2 years ago

```r
# Use Ensembl to extract gene length information based on Ensembl ID
library(biomaRt)
gene.info <- read.table("gene_info.tsv", header = TRUE, stringsAsFactors = FALSE)

mart <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")
ensembl.gene <- gene.info$gene.ensembl[which(!is.na(gene.info$gene.ensembl))]
ensembl.map.rslt <- getBM(attributes = c("mgi_symbol", "start_position", "end_position",
                                         "ensembl_gene_id_version", "external_gene_name",
                                         "ensembl_gene_id"),
                          filters = "ensembl_gene_id",
                          values = ensembl.gene,
                          mart = mart)
ensembl.map.rslt <- unique(ensembl.map.rslt[, c(2:6)])
rownames(ensembl.map.rslt) <- ensembl.map.rslt$ensembl_gene_id

# Calculate length of the gene
ensembl.map.rslt$gene.length <- abs(ensembl.map.rslt$end_position - ensembl.map.rslt$start_position) / 1000

gene.info$gene.length <- ensembl.map.rslt[gene.info$gene.ensembl, "gene.length"]

gene.info.sub <- gene.info[which(!is.na(gene.info$gene.length)), ]
rownames(gene.info.sub) <- gene.info.sub$gene.sym
```

Hi! May I ask what this gene_info.tsv file is? I am trying to create a human bulk reference from the file "https://s3.amazonaws.com/mssm-seq-matrix/human_matrix_v11.h5" and am stuck at the step above. By the way, for a high-resolution reference, is there a recommended human version of the cell atlas? Thank you!

KaetheKong commented 2 years ago

Hello,

I have just uploaded gene_info.tsv to the Capybara/examples folder. Please find it there and let us know if that is helpful! It is basically a tab-delimited file with Ensembl IDs and the corresponding gene symbols.
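For readers following along, here is a minimal sketch of the layout this implies and of how the snippet in the question consumes it. The column names `gene.ensembl` and `gene.sym` are taken from that snippet. The IDs, symbols, and coordinates below are made-up placeholders, not the contents of the real file, and the `coords` table only stands in for what `getBM()` would return.

```r
# Toy stand-in for gene_info.tsv: tab-delimited, one row per gene.
# Placeholder values only; the real file is in Capybara/examples.
tsv.text <- paste(
  "gene.ensembl\tgene.sym",
  "ENSMUSG_PLACEHOLDER_1\tGeneA",
  "ENSMUSG_PLACEHOLDER_2\tGeneB",
  "NA\tGeneC",  # a gene with no Ensembl ID
  sep = "\n"
)
gene.info <- read.table(text = tsv.text, header = TRUE, sep = "\t",
                        stringsAsFactors = FALSE)

# Stand-in for the getBM() result: coordinates keyed by Ensembl ID.
coords <- data.frame(
  ensembl_gene_id = c("ENSMUSG_PLACEHOLDER_1", "ENSMUSG_PLACEHOLDER_2"),
  start_position  = c(1000, 5000),
  end_position    = c(3500, 5800),
  stringsAsFactors = FALSE
)
rownames(coords) <- coords$ensembl_gene_id

# Gene length in kilobases, as in the snippet from the question.
coords$gene.length <- abs(coords$end_position - coords$start_position) / 1000

# Look up lengths by Ensembl ID; genes without a match get NA and are dropped.
gene.info$gene.length <- coords[gene.info$gene.ensembl, "gene.length"]
gene.info.sub <- gene.info[!is.na(gene.info$gene.length), ]
rownames(gene.info.sub) <- gene.info.sub$gene.sym
```

With these placeholder values, `gene.info.sub` keeps `GeneA` and `GeneB` and drops `GeneC`, which has no Ensembl ID to look up.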

For the human cell atlas, one of the resources coming from the same lab that created MCA is the Human Cell Landscape (HCL, https://db.cngb.org/HCL/). This could be a good place to start, and I think there are many other human cell atlases that could be useful!
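Because the question is about a human reference, the mouse-specific parts of the biomaRt query quoted in the question would presumably need to change. The following is a minimal sketch, assuming the Ensembl dataset `hsapiens_gene_ensembl` and the `hgnc_symbol` attribute in place of `mmusculus_gene_ensembl` and `mgi_symbol`. The helper names `strip.version` and `get.human.gene.coords` are hypothetical, and this is not confirmed to be the Capybara authors' own workflow. The query itself needs the Bioconductor package biomaRt and network access to Ensembl.

```r
# Drop a trailing version suffix (e.g. ".17") so that IDs match the
# "ensembl_gene_id" filter, which expects unversioned IDs.
strip.version <- function(ids) sub("\\.[0-9]+$", "", ids)

# Hypothetical helper: query Ensembl for human gene coordinates.
# Requires biomaRt and network access; it is only defined here, not called.
get.human.gene.coords <- function(ensembl.ids) {
  if (!requireNamespace("biomaRt", quietly = TRUE)) {
    stop("Package 'biomaRt' is required; install it from Bioconductor.")
  }
  mart <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
  res <- biomaRt::getBM(
    attributes = c("ensembl_gene_id", "hgnc_symbol",
                   "start_position", "end_position"),
    filters = "ensembl_gene_id",
    values = unique(strip.version(ensembl.ids)),
    mart = mart
  )
  # Keep one row per Ensembl ID, then compute gene length in kilobases.
  res <- res[!duplicated(res$ensembl_gene_id), ]
  rownames(res) <- res$ensembl_gene_id
  res$gene.length <- abs(res$end_position - res$start_position) / 1000
  res
}

# Offline check of the ID clean-up step (the version number is illustrative).
clean.ids <- strip.version(c("ENSG00000141510.17", "ENSG00000141510"))
```

The table returned by `get.human.gene.coords()` could then be indexed by Ensembl ID in the same way as `ensembl.map.rslt` in the original snippet.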

Hope this is helpful! Wenjun

Danyi-ZHENG commented 2 years ago

Thank you! I will have a look and get back to you as soon as possible. Thanks a lot for your support!

Best, Miley ZHENG Danyi

Miley ZHENG, Danyi Ph.D. Student | Level 5 NBD Laboratory of Reconstructive Neurobiology | Duke-NUS Medical School, Singapore



KaetheKong commented 2 years ago

Hi Miley,

No problem at all! I will close this issue for now; feel free to open a new one if you have more questions!

Wenjun