mainlp / semantic_components

Finding semantic components in your neural representations.
MIT License

Error when the number of components is zero #1

Open zhiyintan opened 1 day ago

zhiyintan commented 1 day ago

When self.component_vectors is empty (which happens if hdbscan_min_cluster_size or hdbscan_min_samples is set too large), the line linked below raises:

IndexError: list index out of range

https://github.com/mainlp/semantic_components/blob/e5c1c585ed5e0921b9effb2fa042c7d754a334aa/semantic_components/decomposition.py#L511

eichinflo commented 14 hours ago

Thank you for bringing this to our attention! What would you expect to happen when the representation of a non-existent component is requested here? Thinking about it, raising a more expressive error is probably the correct behavior.

I am also concerned about the name of the function: "representation" here refers to the cluster centroid, not a c-tf-idf representation. I think I'll rename it and also add a function to the SCA class that exposes the cluster centroids in the same way.
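
To make the proposal concrete, here is a minimal sketch of what a guarded centroid accessor could look like. It is not the code in decomposition.py: the class name ComponentStore and the method name get_centroid are hypothetical placeholders, and only the attribute name component_vectors and the two HDBSCAN parameter names are taken from this thread.

```python
class ComponentStore:
    """Hypothetical stand-in for the object that holds the component centroids."""

    def __init__(self, component_vectors):
        # One centroid vector per discovered component; the list may be empty
        # when clustering finds no components at all.
        self.component_vectors = list(component_vectors)

    def get_centroid(self, component_id):
        """Return the centroid of a component, or raise a descriptive error."""
        n = len(self.component_vectors)
        if n == 0:
            # Assumed wording; the point is to name the likely cause.
            raise ValueError(
                "No components were found, so no centroid is available. "
                "Consider lowering hdbscan_min_cluster_size or hdbscan_min_samples."
            )
        if not 0 <= component_id < n:
            raise IndexError(
                f"Component {component_id} does not exist; "
                f"valid ids are 0 to {n - 1}."
            )
        return self.component_vectors[component_id]


if __name__ == "__main__":
    store = ComponentStore([])  # simulates a run that produced zero components
    try:
        store.get_centroid(0)
    except ValueError as err:
        print(err)
```

Raising ValueError for the empty case separates "clustering produced nothing" from "this particular id is out of range", which still raises IndexError. Whether that distinction is wanted in the library is a design choice left open here.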