Closed Fatima-Zare closed 2 years ago
Hi @Fatima-Zare
Q1: The splatPopSimulate function simulates scRNA-seq data for every individual in the provided VCF. So by specifying mockVCF(n.samples = 100) and passing that output to splatPopSimulate, you will generate data for 100 samples. Note that the mockVCF function is quite basic. To generate variant data with more realistic LD/population structure, consider using something like HAPGEN2 or sim1000G. Alternatively, you can provide splatPop with genotype data from real donors using data from public repositories (e.g., GTEx).
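As a rough sketch of the Q1 workflow (not your exact code): the values below (20 genes, 50 cells per sample, 5 equally likely groups) are assumptions taken from the numbers in your question, and the object names are only illustrative.

```r
library(splatter)

# Mock genotypes for 100 individuals; mockVCF() is basic and has no
# realistic LD or population structure.
vcf <- mockVCF(n.samples = 100)

# Illustrative parameters (assumed values): 20 genes, 50 cells per
# individual, and 5 cell groups with equal probability.
params.group <- newSplatPopParams(
    nGenes = 20,
    batchCells = 50,
    group.prob = rep(0.2, 5)
)

# One set of cells is simulated for every individual in the VCF.
sim.sc.gr <- splatPopSimulate(
    vcf = vcf,
    params = params.group,
    sparsify = FALSE
)

# Count the distinct individuals present in the simulated cells.
n.individuals <- length(unique(sim.sc.gr$Sample))
n.individuals
```

Swapping mockVCF for genotypes read from a real or HAPGEN2/sim1000G-generated VCF should only change the `vcf` object, not the rest of the call.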
Q2: The code you provided above should be doing exactly this. You can confirm by inspecting the simulated gene means for each gene and each individual within each cell group. For example:
metadata(sim.sc.gr)$Simulated_Means$Group1[1:5, 1:10]
metadata(sim.sc.gr)$Simulated_Means$Group2[1:5, 1:10]
You can also see exactly which cell-type-specific DE effects are being added by inspecting the rowData:
rowData(sim.sc.gr)[1:5,grep(".GroupEffect", names(rowData(sim.sc.gr)))]
Thanks for using splatter and let us know if you have more questions.
Thank you for your reply.
For example, if I want to increase the variance of the vector
metadata(sim.sc.gr)$Simulated_Means$Group1[1, 1:100], or metadata(sim.sc.gr)$Simulated_Means$Group1[2, 1:100], or metadata(sim.sc.gr)$Simulated_Means$Group1[3, 1:100], etc., how should I change the parameters?
Currently the variance between individuals in your simulated population is quite low because you have set similarity.scale = 8. That parameter impacts the shape of the gamma distribution from which the coefficient of variation (CV) for each gene is sampled: a larger similarity.scale value results in smaller CVs and thus less variation between individuals. To increase the variance, try decreasing similarity.scale.
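If it helps, here is a minimal sketch for checking the effect yourself. It runs the simulation at two similarity.scale values and compares the per-gene CV of the simulated means across individuals. The helper name `geneCV` and all parameter values other than similarity.scale = 8 are assumptions for illustration, and results will vary from run to run.

```r
library(splatter)

vcf <- mockVCF(n.samples = 100)

# Hypothetical helper: simulate with a given similarity.scale and return
# the per-gene CV (sd / mean) of the Group1 simulated means across
# individuals.
geneCV <- function(scale) {
    params <- newSplatPopParams(
        nGenes = 20,
        batchCells = 50,
        group.prob = c(0.5, 0.5),
        similarity.scale = scale
    )
    sim <- splatPopSimulate(
        vcf = vcf,
        params = params,
        sparsify = FALSE,
        verbose = FALSE
    )
    # Rows are genes, columns are individuals.
    means <- as.matrix(metadata(sim)$Simulated_Means$Group1)
    apply(means, 1, sd) / rowMeans(means)
}

cv.high.similarity <- geneCV(8)
cv.low.similarity <- geneCV(1)

# Compare the typical between-individual CV under each setting.
median(cv.high.similarity, na.rm = TRUE)
median(cv.low.similarity, na.rm = TRUE)
```

The median CV for the smaller similarity.scale would be expected to be the larger of the two, but with only 20 genes the comparison is noisy, so it is worth repeating with more genes before settling on a value.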
Thank you for your reply. After changing similarity.scale, I now see a larger variance and more variation between individuals, for example in the following vector:
metadata(sim.sc.gr)$Simulated_Means$Group1[1, 1:100], and likewise in all other rows of the matrix X, where
X = metadata(sim.sc.gr)$Simulated_Means$Group1.
@Fatima-Zare was @azodichr able to answer your question? Just wanted to check before I close this issue.
Dear Luke, Christina helped me get through my problem and you can close the issue. Thank you again for your help.
Best, Fatima
I’m using Splatter to generate simulated single-cell data. I need variability between samples, meaning that the expression levels of genes should change across samples. I have 100 samples, 20 genes, and 5 cell types, and my code to generate the single-cell data is:
I have two questions:
1. Is there any other way that I can generate 100 samples?
2. I also want the expression level of each gene to differ between individual samples. For example, the expression level of Gene1 in celltypeA for sample 1 should be different from the expression level of Gene1 in celltypeA for sample 2, and so on. Is there any way I can get this property?