Closed ianrandman closed 4 months ago
Thank you for your work on this! I left a couple of small comments. Other than that, can you run ruff? With a PR that was recently merged, we now use ruff for the formatting/linting.
Thanks for pointing out the recent incorporation of ruff. I'll have to start using that in my own projects.
I have fixed my changes with ruff, fixed the merge conflict, and resolved a couple of your comments. Please mark them resolved if my changes look good. The only thing remaining is the discussion about probabilities.
It looks like this is still failing, I think because you only ran one of the two Ruff commands. Ruff has format, which replaces e.g. Black, and check, which replaces e.g. flake8. If you run ruff check --fix, it should autofix the remaining issues and let you know which ones need manual fixing. You can also run make format and make lint as shortcuts.
I believe the code check in Python 3.8 failed because it's not familiar with the tuple[..., ...] type hints. I believe either replacing them with Tuple (via from typing import Tuple) or simply removing those type hints should work.
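For reference, here is a minimal sketch of the Python 3.8-compatible form of such hints. The function split_pairs and its arguments are made up for illustration and are not taken from the PR; the point is only that typing.Tuple and typing.List are used where tuple[...] and list[...] would raise a TypeError on 3.8 when the function is defined.

```python
# Built-in generics such as tuple[int, float] are only subscriptable from
# Python 3.9 onwards (PEP 585). The typing aliases below work on 3.8 as well.
from typing import List, Tuple


def split_pairs(pairs: List[Tuple[int, float]]) -> Tuple[List[int], List[float]]:
    """Split (topic, probability) pairs into two parallel lists.

    Hypothetical helper, used here only to show the annotation style.
    """
    topics = [topic for topic, _ in pairs]
    probabilities = [probability for _, probability in pairs]
    return topics, probabilities


topics, probabilities = split_pairs([(0, 0.9), (-1, 0.1)])
```

Another option that should also work for annotations is adding from __future__ import annotations at the top of the module, which postpones their evaluation, but switching to typing.Tuple is the smaller change.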
@ianrandman Awesome, everything passed and I think we addressed all the comments we had. Just to be 100% sure, shall I go ahead and merge this?
Yes, all good to merge if it looks good to you. Happy to be done with this :).
@ianrandman Awesome, thank you for taking the time over the last couple of weeks to work on this. It is greatly appreciated and hopefully this will also make it easier for you to use BERTopic instead of your own fork. If there are any other changes you would like to see, please let me know!
type(self.hdbscan_model) != BaseCluster when checking whether model is zero-shot
BaseCluster during fit_transform() during zero-shot topic modeling.
self._outliers rather than tracking it to maintain alignment using @property
@property
topic_to, topics_from for mapping
reduce_outliers()
zeroshot_min_similarity. Otherwise, the calculated representation is used.
Fixes #1967