Closed Danyi-ZHENG closed 2 years ago
Hello,
I have just uploaded gene_info.tsv into the Capybara/examples folder. Please find it there and let us know if that is helpful! It is basically a tab-delimited file with Ensembl IDs and their corresponding gene symbols.
For the human cell atlas, one resource from the same lab that created the MCA is the Human Cell Landscape (HCL). This could be a good place to start, and I think there are many other human cell atlases that could be useful!
Hope this is helpful! Wenjun
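For readers who want to check the layout before downloading, here is a minimal sketch of what such a file might look like when read into R. The column names `gene.ensembl` and `gene.sym` are an assumption taken from how the snippet in this thread accesses the table; the two rows are illustrative and the real file in Capybara/examples may differ in column order or content.

```r
# Write a tiny stand-in file with the assumed two-column layout.
# (Hypothetical example; the real gene_info.tsv lives in Capybara/examples.)
example.path <- tempfile(fileext = ".tsv")
writeLines(c("gene.ensembl\tgene.sym",
             "ENSMUSG00000051951\tXkr4",
             "ENSMUSG00000025902\tSox17"),
           example.path)

# Read it as a tab-delimited table with a header row.
gene.info <- read.table(example.path, header = TRUE, sep = "\t",
                        stringsAsFactors = FALSE)
str(gene.info)
```

Passing `sep = "\t"` explicitly is a small precaution so that any field containing spaces is not split.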
Thank you! I will have a look and get back to you ASAP. Thanks a lot for your support!
BEST, Miley ZHENG Danyi
Miley ZHENG, Danyi Ph.D. Student | Level 5 NBD Laboratory of Reconstructive Neurobiology | Duke-NUS Medical School, Singapore
From: KaetheKong Sent: Wednesday, April 20, 2022 3:55:24 PM To: morris-lab/Capybara Cc: Zheng Danyi; Author Subject: Re: [morris-lab/Capybara] ARCSH4 mining (Issue #10)
Hello,
I have just uploaded gene_info.tsv into the Capybara/examples folder. Please find it there and let us know if that is helpful! It is basically a tab-delimited file with Ensembl IDs and their corresponding gene symbols.
For the human cell atlas, one resource from the same lab that created the MCA is the Human Cell Landscape (HCL, https://db.cngb.org/HCL/). This could be a good place to start, and I think there are many other human cell atlases that could be useful!
Hope this is helpful! Wenjun
Hi Miley,
No problem at all! I will close this issue for now. Feel free to open a new one if you have more questions!
Wenjun
```r
# Use Ensembl to extract gene length information based on Ensembl ID
library(biomaRt)
gene.info <- read.table("gene_info.tsv", header = TRUE, stringsAsFactors = FALSE)

mart <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")
ensembl.gene <- gene.info$gene.ensembl[which(!is.na(gene.info$gene.ensembl))]
ensembl.map.rslt <- getBM(attributes = c("mgi_symbol", "start_position", "end_position",
                                         "ensembl_gene_id_version", "external_gene_name",
                                         "ensembl_gene_id"),
                          filters = "ensembl_gene_id",
                          values = ensembl.gene,
                          mart = mart)
ensembl.map.rslt <- unique(ensembl.map.rslt[, c(2:6)])
rownames(ensembl.map.rslt) <- ensembl.map.rslt$ensembl_gene_id

# Calculate length of the gene
ensembl.map.rslt$gene.length <- abs(ensembl.map.rslt$end_position - ensembl.map.rslt$start_position) / 1000

gene.info$gene.length <- ensembl.map.rslt[gene.info$gene.ensembl, "gene.length"]

gene.info.sub <- gene.info[which(!is.na(gene.info$gene.length)), ]
rownames(gene.info.sub) <- gene.info.sub$gene.sym
```
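The length calculation and the row-name lookup in the snippet above can be tried offline. The sketch below is only an illustration under stated assumptions: it swaps the `getBM` call for a small mock table, and the gene IDs, symbols, and coordinates are all made up. It shows that the length is expressed in kilobases and that a gene missing from the Ensembl result ends up as `NA` and is dropped.

```r
# Mock stand-in for the biomaRt result (hypothetical IDs and coordinates).
ensembl.map.rslt <- data.frame(
  ensembl_gene_id = c("ENSMUSG_A", "ENSMUSG_B"),
  start_position  = c(1000, 20000),
  end_position    = c(5000, 12000),
  stringsAsFactors = FALSE
)
rownames(ensembl.map.rslt) <- ensembl.map.rslt$ensembl_gene_id

# Gene length in kb; abs() makes the result independent of coordinate order.
ensembl.map.rslt$gene.length <-
  abs(ensembl.map.rslt$end_position - ensembl.map.rslt$start_position) / 1000

# Mock gene_info table; ENSMUSG_C has no entry in the mock Ensembl result.
gene.info <- data.frame(
  gene.ensembl = c("ENSMUSG_A", "ENSMUSG_B", "ENSMUSG_C"),
  gene.sym     = c("GeneA", "GeneB", "GeneC"),
  stringsAsFactors = FALSE
)

# Look lengths up by row name; unmatched IDs come back as NA.
gene.info$gene.length <- ensembl.map.rslt[gene.info$gene.ensembl, "gene.length"]

# Keep only genes with a known length and index them by symbol.
gene.info.sub <- gene.info[which(!is.na(gene.info$gene.length)), ]
rownames(gene.info.sub) <- gene.info.sub$gene.sym
print(gene.info.sub)
```

Because the final table is indexed by gene symbol, duplicated symbols would make the last assignment fail, so it may be worth checking for duplicates on real data.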
Hi! May I know what this gene_info.tsv file is? I am trying to create a human bulk reference with the file "https://s3.amazonaws.com/mssm-seq-matrix/human_matrix_v11.h5", and I am stuck at the above step. By the way, for a high-resolution reference, is there a recommended human cell atlas? Thank you!