Open jjacobi3 opened 1 year ago
Hi Justine,
Were these datasets collected in one or multiple batches? Will technical differences between samples be a problem?
AL
Hi Allen!
Our samples were collected in separate batches, so we'd like to process them individually as well as together to see a trajectory. We have already processed the samples using Seurat/Signac, so we know there is minimal technical variation across samples, but we'd like to use MIRA since it would be best for addressing our specific hypotheses.
Thanks! -- Justine
Ah I see.
How hard it is to stitch together a trajectory depends on the strength of the technical effects. If they are severe, the trajectory will be distorted. If not, applying MIRA as if this were a single dataset would be easiest, and you can simply learn topics, etc., over all of the batches simultaneously. This would be my first attempt.
Also, we have developed a batch correction method for MIRA that performs quite well and is currently under review. That would be the ideal model for your data, but unfortunately I am a couple of weeks away from publishing the code. If the strategy above does not work because of technical effects, I can try to get the batch-correcting model ready for you to try.
AL
Hi Allen,
Thanks for the advice! The technical effects are minimal, so as a first attempt I'd like to try applying MIRA as if this were a single dataset.
Do you recommend merging all the samples together prior to using MIRA or is there a workflow that you recommend within the preprocessing steps of MIRA?
Thanks again!
In this case, I would recommend merging the samples together before running MIRA. For expression data, this is easy because you can just merge on gene features and select highly variable genes across datasets. For accessibility data it can be more challenging, since you have to call a standardized peak set shared by all samples.
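A minimal sketch of what "merge on gene features" could look like, assuming each sample is a cells-by-genes count matrix with a list of unique gene names. The function name `merge_on_shared_genes`, the toy gene names, and the `day0`/`day2` labels are hypothetical and only for illustration; this is not part of MIRA.

```python
import numpy as np

def merge_on_shared_genes(samples):
    """Stack samples on the genes they all share.

    samples: dict mapping a batch name to (counts, genes), where counts is a
    cells x genes array and genes is a list of unique gene names.
    Returns (merged counts, shared gene names, per-cell batch labels).
    """
    names = list(samples)
    # Keep only genes present in every sample, in the first sample's order.
    shared = set.intersection(*(set(genes) for _, genes in samples.values()))
    ordered = [g for g in samples[names[0]][1] if g in shared]

    blocks, batch = [], []
    for name in names:
        counts, genes = samples[name]
        column = {g: i for i, g in enumerate(genes)}
        # Reorder this sample's columns to match the shared gene order.
        blocks.append(counts[:, [column[g] for g in ordered]])
        batch.extend([name] * counts.shape[0])
    return np.vstack(blocks), ordered, np.array(batch)

# Toy time course: two samples with different gene orders and gene sets.
day0 = (np.array([[1, 0, 3], [2, 1, 0]]), ["GATA1", "SPI1", "CEBPA"])
day2 = (np.array([[0, 5, 1, 2]]), ["SPI1", "GATA1", "RUNX1", "CEBPA"])
merged, genes, batch = merge_on_shared_genes({"day0": day0, "day2": day2})
```

If the samples are already stored as AnnData objects, `anndata.concat` with `join="inner"` should give the same kind of inner join on genes, and the `batch` labels can be kept as a per-cell annotation for selecting highly variable genes across datasets.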
I would recommend piling up your various datasets into a single bigWig file and calling peaks using MACS!
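To make "piling up" concrete, here is a toy sketch that pools fragments from several samples and computes the combined coverage, which is the kind of signal a bigWig track stores. It assumes fragments are given as half-open `(chrom, start, end)` tuples; the function name `pooled_coverage` and the example coordinates are hypothetical. It does not write a bigWig file or call peaks, so MACS would still be run separately on the pooled data.

```python
from collections import defaultdict

def pooled_coverage(fragment_sets):
    """Pile up fragments from all samples into per-chromosome coverage runs.

    fragment_sets: iterable of lists of (chrom, start, end), half-open.
    Returns {chrom: [(start, end, depth), ...]} for regions with depth > 0.
    """
    # Record +1 at each fragment start and -1 at each fragment end.
    deltas = defaultdict(lambda: defaultdict(int))
    for fragments in fragment_sets:
        for chrom, start, end in fragments:
            deltas[chrom][start] += 1
            deltas[chrom][end] -= 1

    coverage = {}
    for chrom, changes in deltas.items():
        runs, depth, prev = [], 0, None
        # Sweep left to right, emitting a run between consecutive breakpoints.
        for pos in sorted(changes):
            if prev is not None and depth > 0:
                runs.append((prev, pos, depth))
            depth += changes[pos]
            prev = pos
        coverage[chrom] = runs
    return coverage

# Two toy samples whose fragments overlap on chr1.
sample_a = [("chr1", 100, 200), ("chr1", 150, 250)]
sample_b = [("chr1", 180, 300)]
pileup = pooled_coverage([sample_a, sample_b])
```

Because every sample contributes to the same pileup, peaks called on it are defined once for all samples, which is what makes the resulting peak set standardized.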
AL
Hi, I have multiple 10x multiome datasets that span a time course. I was hoping you could give some advice or recommend a proper workflow for training the models together.
Thanks!