dotnet / machinelearning

ML.NET is an open source and cross-platform machine learning framework for .NET.
https://dot.net/ml
MIT License

Lda bag of words export model #3092

Open IvanAntipov opened 5 years ago

IvanAntipov commented 5 years ago

I use the LDA transformation from the example:

var pipeline =
    ml.Transforms.Text.ProduceWordBags(review)
        .Append(ml.Transforms.Text.LatentDirichletAllocation(review, ldaFeatures, numberOfTopics: 3));

var transformer = pipeline.Fit(trainData);
var transformed_data = transformer.Transform(trainData);

Now I am trying to visualize the data with pyLDAvis.

For this task I need the phi matrix (topic-term distributions), the theta matrix (document-topic distributions), the vocabulary, and term_frequency.
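To make the list above concrete: pyLDAvis.prepare takes topic_term_dists (phi), doc_topic_dists (theta), doc_lengths, vocab, and term_frequency, so doc_lengths is needed as well. The two count-based inputs can be derived from the bag-of-words counts alone. The following is a minimal sketch that assumes the word-bag column has already been read into a dense document-by-term array. `LdaVisInputs` and its methods are hypothetical helpers, not part of ML.NET, and whether ML.NET returns phi and theta already normalized is not confirmed here, which is why a row-sum check is included.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of ML.NET. It assumes a dense count matrix
// with one row per document and one column per vocabulary term.
public static class LdaVisInputs
{
    // doc_lengths: total number of tokens in each document (row sums).
    public static double[] DocLengths(double[][] counts) =>
        counts.Select(row => row.Sum()).ToArray();

    // term_frequency: corpus-wide count of each term (column sums).
    public static double[] TermFrequency(double[][] counts)
    {
        if (counts.Length == 0) return Array.Empty<double>();

        var totals = new double[counts[0].Length];
        foreach (var row in counts)
        {
            if (row.Length != totals.Length)
                throw new ArgumentException("All rows must have the same number of terms.");
            for (int t = 0; t < row.Length; t++)
                totals[t] += row[t];
        }
        return totals;
    }

    // pyLDAvis expects each row of phi and theta to be a probability
    // distribution, so this checks that every row sums to 1 within a tolerance.
    public static bool RowsSumToOne(double[][] matrix, double tolerance = 1e-6) =>
        matrix.All(row => Math.Abs(row.Sum() - 1.0) <= tolerance);
}
```

For a two-document, three-term corpus with counts { 2, 0, 1 } and { 0, 3, 1 }, DocLengths returns { 3, 4 } and TermFrequency returns { 2, 3, 2 }. The vocabulary itself still has to come from the word-bag transform, because the column order of the count matrix is only meaningful together with the term list.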

It is possible to get the theta matrix by transforming the documents, and it is possible to get the phi matrix with the internal SingleBox.GetModel method (using reflection), but I didn't manage to get the vocabulary.

At the moment it is too hard to export LDA-related parameters from an ML.NET pipeline.

It would be nice to be able to export the complete set of LDA-related parameters.

najeeb-kazmi commented 4 years ago

Related to #4322. Keeping this issue open since the ask is broader in scope than just getting the list of relevant words for each topic.