Hi Connor,
Thanks for checking out MIRA! Just to clarify, from a trained topic model, you want to find the reconstruction loss on a per cell basis?
Allen
From: Connor Finkbeiner. Sent: Friday, October 14, 2022 6:44 PM. To: cistrome/MIRA. Subject: [cistrome/MIRA] Finding reconstructed gene by cell from different models (Issue #12)
I was hoping to look at the error in the final model on a per cell basis. I was hoping to get the reconstructed gene x cell matrices, or some measurement of loss for each cell. Is this information available in one of the objects/files in the pipeline?
To be clear, I am not thinking about just the topic values in the coding layer. I was hoping to get the output after the best model attempts to decode the compressed encoding.
Yes, I want to look across cell types to see if the reconstruction was worse for any of my cell types.
Also I'm excited to play around with MIRA!
Hi Connor,
Unfortunately, the current API does not support seeing per cell losses. I am working on a substantial update and will consider adding this feature.
Allen
From: Connor Finkbeiner. Sent: Monday, October 17, 2022 12:03 PM. To: cistrome/MIRA. Cc: AllenWLynch. Subject: Re: [cistrome/MIRA] Finding reconstructed gene by cell from different models (Issue #12)
Yes, I wanted to look across cell types to see if the encoding was worse for any of my cell types.
Thanks for the response!
Also, if I wanted to estimate the reconstruction accuracy across cells, would comparing anndata[counts] with model.impute() be meaningful (once I column-normalize the counts)?
It may be informative if you normalize counts, but likely only for very highly expressed genes.
For genes with many zero counts, it will be much harder to observe a trend. We don't inflate zero counts or model dropout rates; we assume zero counts occur because of low transcript abundance.
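For anyone wanting to try the comparison discussed above, here is a minimal sketch using plain NumPy. It does not call MIRA: `counts` and `imputed` are assumed to be dense arrays you have already extracted yourself (for example the raw counts and the result of the imputation step), with cells as rows and genes as columns. The function names, the `top_genes` argument, and the choice of mean absolute error are illustrative assumptions, not part of the MIRA API. Both matrices are normalized per cell so they sit on the same scale, and the comparison can be restricted to the most highly expressed genes, where a trend is most likely to be visible.

```python
import numpy as np


def per_cell_error(counts, imputed, top_genes=None):
    """Mean absolute error per cell between normalized counts and imputed values.

    counts    : (n_cells, n_genes) raw counts, cells as rows (an assumption)
    imputed   : (n_cells, n_genes) imputed expression, same shape (an assumption)
    top_genes : if given, keep only this many genes with the highest total counts
    """
    counts = np.asarray(counts, dtype=float)
    imputed = np.asarray(imputed, dtype=float)

    if top_genes is not None:
        # Rank genes by total counts and keep the most highly expressed ones.
        keep = np.argsort(counts.sum(axis=0))[::-1][:top_genes]
        counts, imputed = counts[:, keep], imputed[:, keep]

    def normalize(x):
        # Scale each cell (row) to sum to 1; all-zero cells stay all zero.
        totals = x.sum(axis=1, keepdims=True)
        return np.divide(x, totals, out=np.zeros_like(x), where=totals > 0)

    return np.abs(normalize(counts) - normalize(imputed)).mean(axis=1)


def mean_error_by_group(errors, labels):
    """Average the per-cell errors within each label (e.g. cell type)."""
    labels = np.asarray(labels)
    return {str(g): float(errors[labels == g].mean()) for g in np.unique(labels)}
```

Passing a vector of cell-type labels to `mean_error_by_group` then gives one average error per cell type, which is one way to check whether reconstruction looks worse for particular cell types. Note that when `top_genes` is set, each cell is renormalized over the retained genes only.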
I was hoping to look at the error in the final model on a per-cell basis. Is there a way to get the reconstructed gene x cell matrices, or some measurement of loss for each cell? Is this information available in one of the objects/files in the pipeline?
I am not thinking about just the topic values in the coding layer. I was hoping to get the output after the best model attempts to decode the compressed encoding.