stephenslab / fastTopics

Fast algorithms for fitting topic models and non-negative matrix factorizations to count data.
https://stephenslab.github.io/fastTopics

de_analysis using LDA Topic Modeling #36

Closed gomeznick86 closed 1 year ago

gomeznick86 commented 1 year ago

Hey @pcarbo!

I'm really interested in using de_analysis to analyze our data. However, we've already processed the data and found topics using LDA outside of your package. I have a version of your n x K matrix fit$L, but I think I need more in order to start the differential expression analysis. Do I need to first run init_poisson_nmf? If so, I'm not sure what the values of F should be.

Sorry if I missed this somewhere.

pcarbo commented 1 year ago

@gomeznick86 Thanks for your interest in fastTopics. I should improve the de_analysis interface so that it also accepts an L matrix.

In the meantime, you could do something like this:

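# X is the n x m counts matrix; L is the n x k topic proportions
# matrix, which is already defined.
m <- ncol(X)
k <- ncol(L)
# Placeholder for the m x k matrix F; its values are not used.
Fdummy <- matrix(0,m,k)
rownames(Fdummy) <- colnames(X)  # gene names
colnames(Fdummy) <- colnames(L)
fit <- list(L = L,F = Fdummy)
class(fit) <- c("multinom_topic_model_fit","list")
de <- de_analysis(fit,X,...)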

It doesn't actually use the F matrix so I call it "Fdummy".

This should work, but if not let me know.

pcarbo commented 1 year ago

Also, if you can leave this Issue open, it will be a reminder for me to make this improvement to de_analysis.

gomeznick86 commented 1 year ago

Thanks for the quick response! I really appreciate the workaround. It looks like it is working. Thanks, and I'll leave it open.

MarcElosua commented 1 year ago

Hi @pcarbo,

Thank you so much for such a great package! I had the same question as @gomeznick86.

Thank you for your workaround and looking forward to the update!

gomeznick86 commented 1 year ago

Hey @pcarbo. Just a heads up if you do end up implementing this option. I learned the hard way that the rownames of X and rownames of L need to match so maybe having a quick identical() check would be useful. Thanks again for this package and the neat partial membership trick!
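For anyone using the workaround in the meantime, a check along these lines could be run before calling de_analysis. This is only a sketch in base R: check_sample_names is a hypothetical helper (it is not part of fastTopics), and the toy X and L are made up for illustration.

```r
# Hypothetical helper (not part of fastTopics): stop early if the rows
# of the counts matrix X and the topic proportions matrix L do not
# refer to the same samples in the same order.
check_sample_names <- function (X, L) {
  if (nrow(X) != nrow(L))
    stop("X and L must have the same number of rows")
  if (is.null(rownames(X)) || is.null(rownames(L)))
    stop("X and L must both have row names")
  if (!identical(rownames(X), rownames(L)))
    stop("rownames(X) and rownames(L) must be identical and in the same order")
  invisible(TRUE)
}

# Toy data: 3 samples, 4 genes, 2 topics.
X <- matrix(1:12, 3, 4,
            dimnames = list(paste0("cell", 1:3), paste0("gene", 1:4)))
L <- matrix(c(0.2, 0.5, 0.9, 0.8, 0.5, 0.1), 3, 2,
            dimnames = list(paste0("cell", 1:3), c("k1", "k2")))
check_sample_names(X, L)  # Passes silently.

# Same samples in a different order: the check now fails.
L_shuffled <- L[c(2, 1, 3), ]
res <- tryCatch(check_sample_names(X, L_shuffled),
                error = function (e) "mismatch")
```

If the names match as sets but not in order, reordering with L <- L[rownames(X), ] before building the fit object should be enough.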

pcarbo commented 1 year ago

@gomeznick86 @MarcElosua I've implemented the feature you suggested so that de_analysis can be provided with a topic proportions matrix L instead of a Poisson NMF/topic model object.
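Since L here comes from an LDA fit produced outside fastTopics, it may be worth confirming that it really is a proportions matrix (non-negative entries, each row summing to 1) before passing it in. The sketch below uses base R only; is_topic_proportions and normalize_rows are hypothetical helpers, not fastTopics functions, and whether a given LDA tool returns unnormalized weights is an assumption to verify against that tool's output.

```r
# Hypothetical helper (not part of fastTopics): TRUE if L is a numeric
# matrix with finite, non-negative entries whose rows each sum to 1.
is_topic_proportions <- function (L, tol = 1e-8) {
  is.matrix(L) && is.numeric(L) && all(is.finite(L)) && all(L >= 0) &&
    all(abs(rowSums(L) - 1) < tol)
}

# Hypothetical helper: rescale each row to sum to 1. Dividing a matrix
# by a vector of length nrow(L) recycles down the columns, so entry
# (i,j) is divided by the sum of row i.
normalize_rows <- function (L) L / rowSums(L)

# Toy example: unnormalized topic weights for 2 samples and 2 topics.
L_raw <- matrix(c(2, 1, 6, 8), 2, 2)  # Rows are (2,6) and (1,8).
L <- normalize_rows(L_raw)            # Rows are (1/4,3/4) and (1/9,8/9).
```

Rescaling is only appropriate if the external weights are meant to be interpreted as relative topic memberships.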

Also, @gomeznick86, I added checks for dimnames being consistent.

I'll close this issue, but please open again if you notice any issues with this new feature.