Rationale:
Topics can already be described in the models by using the print_topics() method. The problem is that this does not output the table in a reusable format. If a user wants the topic descriptions in a format that can be reused and imported into other software, this is impractical.
We should provide an easy way to achieve this in Turftopic.
Implementation:
I have a couple of approaches up my sleeve for this:
1. We can create a new method that returns the topic descriptions as a pd.DataFrame. This would be easy for data scientists to work with, but it would introduce pandas as a dependency, which I would love to avoid.
2. We can add a parameter to print_topics() that specifies the output format, e.g. model.print_topics(format="csv"). This would allow users to pipe stdout into a CSV file. It would not introduce dependencies, but it would be a bit less intuitive.
3. We could add a method that returns a string, e.g. model.export_topics(format="csv"), which users can then print or write out to a file.
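To make the string-returning option more concrete, here is a rough sketch of how it could work using only the standard library. This is not existing Turftopic code: the standalone export_topics function, the column names, and the (topic_id, top_words) input structure are all assumptions made for illustration.

```python
import csv
import io


def export_topics(topics, format="csv"):
    """Return topic descriptions as a string in the requested format.

    `topics` is a hypothetical list of (topic_id, top_words) pairs;
    the real model internals may look different.
    """
    if format != "csv":
        raise ValueError(f"Unsupported format: {format!r}")
    buffer = io.StringIO()
    # lineterminator="\n" avoids the csv module's default "\r\n" line endings
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["topic_id", "top_words"])
    for topic_id, words in topics:
        # Join the top words into one cell; csv.writer quotes it as needed
        writer.writerow([topic_id, ", ".join(words)])
    return buffer.getvalue()


# Made-up example topics, purely for demonstration
topics = [(0, ["cat", "dog", "pet"]), (1, ["stock", "market", "trade"])]
csv_text = export_topics(topics, format="csv")
print(csv_text, end="")
```

Since the result is a plain string, users could print it, write it to a file, or pass it to pandas themselves via pd.read_csv(io.StringIO(csv_text)), without pandas becoming a dependency of the library.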