koheiw / seededlda

LDA for semisupervised topic modeling
https://koheiw.github.io/seededlda/

FYI: A review of this package #75

Open chainsawriot opened 4 months ago

chainsawriot commented 4 months ago

There is a tidbit about this package in this now widely shared blog post.

As you can see, it is listed under "implicit type conversions". But after "a reader of this blog" (you can guess who he is) pointed out the factual error in the accusation, it is not about implicit type conversions at all. Instead, it is about whether batch_size should be a proportion, and according to the writer, three people did not understand what batch_size means.

I think the documentation explains clearly what batch_size does, and in my opinion it makes a lot of sense for it to be a proportion. But I am afraid some people might think that this parameter works like gensim's chunksize, which is a number of documents rather than a share of the corpus.
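
To make the two readings concrete, here is a minimal Python sketch of the difference. It does not reproduce the internals of seededlda or gensim; the helper names `batches_from_proportion` and `batches_from_count` are hypothetical, and the rounding rule (ceiling) is an assumption made only for illustration.

```python
import math


def batches_from_proportion(n_docs: int, batch_size: float) -> list[range]:
    """Split document indices into batches when batch_size is a proportion.

    Hypothetical helper: batch_size = 0.25 means each batch holds about a
    quarter of the corpus, however large the corpus is.
    """
    if not 0 < batch_size <= 1:
        raise ValueError("batch_size must be in (0, 1]")
    # Assumption: round up so that no document is left out of a batch.
    per_batch = max(1, math.ceil(n_docs * batch_size))
    return [range(i, min(i + per_batch, n_docs)) for i in range(0, n_docs, per_batch)]


def batches_from_count(n_docs: int, chunksize: int) -> list[range]:
    """Split document indices into batches when the size is an absolute count.

    Hypothetical helper: chunksize = 250 means 250 documents per batch,
    so the number of batches grows with the corpus.
    """
    if chunksize < 1:
        raise ValueError("chunksize must be a positive integer")
    return [range(i, min(i + chunksize, n_docs)) for i in range(0, n_docs, chunksize)]


if __name__ == "__main__":
    for n_docs in (1_000, 10_000):
        by_prop = batches_from_proportion(n_docs, 0.25)
        by_count = batches_from_count(n_docs, 250)
        print(n_docs, len(by_prop), len(by_count))
```

With a proportion, the number of batches stays the same as the corpus grows; with an absolute count, the batch size stays the same and the number of batches grows. Passing a value such as 250 to a parameter that expects a proportion is exactly the kind of confusion discussed here.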

koheiw commented 4 months ago

Thanks. I am happy that my package attracted people's attention. Do you think this document helps users?

chainsawriot commented 4 months ago

@koheiw Yes, this document would definitely be helpful.

koheiw commented 4 months ago

It is linked from the README of this repo as a "working paper", but I have not really been working on it... I should update it to answer questions like this or that. Where do you think I should publish it?

chainsawriot commented 4 months ago

Well, for this working paper, I would say arXiv/SocArXiv would be nice. For the subsequent actual paper, I am afraid I am not an expert. But you could try something like Computational Communication Research's Tool Announcements.

koheiw commented 4 months ago

Thanks! I never thought about the Tool Announcements.