x-tabdeveloping / turftopic

Robust and fast topic models with sentence-transformers.
https://x-tabdeveloping.github.io/turftopic/
MIT License
8 stars 3 forks source link

[WIP] suggestions for minor fixes #20

Closed rbroc closed 4 months ago

rbroc commented 4 months ago

heya, I've started exploring hoping to be able to contribute a bit to development. this is just a kitchen sink PR with minor suggestions as I get a feel for how it works. I'll request review and add couple more suggestions once I am done.

x-tabdeveloping commented 4 months ago

The fact that I don't import TopicData from topicwizard was intentional. I don't want people to have to install a bunch of dependencies that topicwizard has, as they might not need any of it. In my view the less coupled any two codebases are, the better. Besides, TopicData only exists at the leve of the static type checker and does not actually add any behaviour to the package ((also intentional)). The coupling argument also goes for the batching function, though I don't see why it would be problematic to put it in utils (it should be fine) I am generally in favour of repeating ourselves instead of introducing dependencies or abstractions for practical and ideological reasons (we can debate some of this though :D)

rbroc commented 4 months ago

i am on the opposite side of the ideological spectrum, but you're the boss -- i've reverted changes on both. Concerning topicwizard, one of the things I was actually going to suggest is to move plotting utils for dynamic topic models there for consistency with other models, but if you want to keep them independent also fine to leave them there.

x-tabdeveloping commented 4 months ago

That's a reasonable suggestion, let's talk about it sometime. I'm not sure the way I currently do it is the optimal one, as I basically came up with it in a rush. We should also talk about the code repetition thing sometime, I hope/think I might be able to convince you.

x-tabdeveloping commented 4 months ago

Also @rbroc I would not add new commits that revert your changes, just make a new PR, where you git cherry-pick or git rebase -i to select the commits that are actually important from this PR.

rbroc commented 4 months ago

awesome, will do!

rbroc commented 4 months ago

closing this as we agreed to not require topicwizard and i am willing to concede on keeping abstraction minimal ;) the remaining relevant changes from this PR are all docstring-related -- will open a new PR with those