cistrome / MIRA

Python package for analysis of multiomic single cell RNA-seq and ATAC-seq.

Finding reconstructed gene by cell from different models #12

Closed. Connorr0 closed this issue 1 year ago

Connorr0 commented 1 year ago

I was hoping to look at the error in the final model on a per-cell basis. Is there a way to get the reconstructed gene x cell matrices, or some measurement of loss for each cell? Is this information available in one of the objects/files in the pipeline?

I am not thinking about just the topic values coding layer; I was hoping to get the output after the best model attempts to decode the compressed encoding.

AllenWLynch commented 1 year ago

Hi Connor,

Thanks for checking out MIRA! Just to clarify, from a trained topic model, you want to find the reconstruction loss on a per cell basis?

Allen


From: Connor Finkbeiner Sent: Friday, October 14, 2022 6:44 PM To: cistrome/MIRA Subject: [cistrome/MIRA] Finding reconstructed gene by cell from different models (Issue #12)


Connorr0 commented 1 year ago

Yes, I want to look across cell types to see if the reconstruction was worse for any of my cell types.

Also I'm excited to play around with MIRA!

AllenWLynch commented 1 year ago

Hi Connor,

Unfortunately, the current API does not support viewing per-cell losses. I am working on a substantial update and will consider adding this feature.

Allen


From: Connor Finkbeiner Sent: Monday, October 17, 2022 12:03 PM To: cistrome/MIRA Cc: AllenWLynch Subject: Re: [cistrome/MIRA] Finding reconstructed gene by cell from different models (Issue #12)


Connorr0 commented 1 year ago

Thanks for the response!

Connorr0 commented 1 year ago

Also, if I wanted an estimate of the reconstruction accuracy across cells, would comparing anndata[counts] with model.impute() be meaningful (once I column-normalized the counts)?

AllenWLynch commented 1 year ago

It may be informative if you normalize counts, but likely only for very highly expressed genes.
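As a rough sketch of that comparison (not part of MIRA's API), the snippet below uses plain NumPy. It assumes you already hold a dense cells x genes count matrix and a matching matrix of imputed values on a per-cell relative-expression scale; whether model.impute() returns values on that scale is an assumption you should check. The helper names `per_cell_error` and `summarize_by_group`, the `top_genes` cutoff, and the synthetic data are all hypothetical.

```python
import numpy as np

def per_cell_error(counts, imputed, top_genes=200):
    """Mean absolute error per cell, restricted to the most highly expressed genes.

    counts  : (cells x genes) raw counts, dense
    imputed : (cells x genes) model output, ASSUMED to be per-cell
              relative expression (each row sums to roughly 1)
    """
    counts = np.asarray(counts, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    # Convert each cell's counts to fractions so both matrices share a scale.
    totals = counts.sum(axis=1, keepdims=True)
    observed = counts / np.maximum(totals, 1.0)
    # Keep only the genes with the largest total counts across all cells.
    keep = np.argsort(counts.sum(axis=0))[::-1][:top_genes]
    return np.abs(observed[:, keep] - imputed[:, keep]).mean(axis=1)

def summarize_by_group(errors, labels):
    """Average the per-cell errors within each cell-type label."""
    labels = np.asarray(labels)
    return {str(g): float(errors[labels == g].mean()) for g in np.unique(labels)}

# Synthetic stand-ins: `rates` plays the role of the imputed matrix.
rng = np.random.default_rng(0)
rates = rng.dirichlet(np.ones(50), size=6)
counts = rng.poisson(rates * 5000)
labels = ["A", "A", "A", "B", "B", "B"]

errors = per_cell_error(counts, rates, top_genes=10)
by_type = summarize_by_group(errors, labels)
print(by_type)
```

Restricting to the top genes follows the caveat above: the comparison is only likely to be informative where counts are high enough to be stable.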

For genes with many zero counts, it will be much more difficult to observe a trend. (We don't inflate zero counts or model dropout rates; we assume zero counts occur because of low transcript abundance.)
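A small simulation can illustrate why low-abundance genes are hard to assess. It assumes Poisson sampling around a known expected count, which is a simplification chosen for illustration and not a statement of MIRA's likelihood; the function name `observed_vs_expected_corr` and all numeric settings are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells = 2000
# Hypothetical per-cell signal that varies two-fold across cells.
signal = rng.uniform(1.0, 2.0, size=n_cells)

def observed_vs_expected_corr(mean_count):
    """Simulate one gene and compare observed counts with their expectation."""
    expected = signal * mean_count          # what a perfect model would predict
    observed = rng.poisson(expected)        # sampling noise only, no dropout model
    zero_frac = float((observed == 0).mean())
    corr = float(np.corrcoef(observed, expected)[0, 1])
    return zero_frac, corr

low_zero, low_corr = observed_vs_expected_corr(0.05)    # rarely detected gene
high_zero, high_corr = observed_vs_expected_corr(50.0)  # highly expressed gene

print(f"low abundance:  zero fraction={low_zero:.2f}, correlation={low_corr:.2f}")
print(f"high abundance: zero fraction={high_zero:.2f}, correlation={high_corr:.2f}")
```

Even with a perfect prediction, the rarely detected gene shows a much weaker observed-versus-expected correlation than the highly expressed one, because sampling noise dominates when most counts are zero.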