boudinfl / pke

Python Keyphrase Extraction module
GNU General Public License v3.0

ValueError: The number of observations cannot be determined on an empty distance matrix. #74

Closed hoonkai closed 5 years ago

hoonkai commented 5 years ago

Hi

I'm trying to run python examples/keyphrase-extraction.py but I keep getting an error saying

  File "examples/keyphrase-extraction.py", line 25, in <module>
    method='average')
  File "/Volumes/Data/Projects/pke/src/pke/pke/unsupervised/graph_based/topicrank.py", line 203, in candidate_weighting
    self.topic_clustering(threshold=threshold, method=method)
  File "/Volumes/Data/Projects/pke/src/pke/pke/unsupervised/graph_based/topicrank.py", line 155, in topic_clustering
    Z = linkage(Y, method=method)
  File "/Volumes/Data/Projects/pke/lib/python3.7/site-packages/scipy/cluster/hierarchy.py", line 1112, in linkage
    n = int(distance.num_obs_y(y))
  File "/Volumes/Data/Projects/pke/lib/python3.7/site-packages/scipy/spatial/distance.py", line 2384, in num_obs_y
    raise ValueError("The number of observations cannot be determined on "
ValueError: The number of observations cannot be determined on an empty distance matrix.

Anyone know why this is the case?

Thanks

haskarb commented 5 years ago

Were you able to resolve this issue?

drmuskangarg commented 5 years ago

I am unable to solve this issue. Can someone help?

boudinfl commented 5 years ago

From the error message, it seems that the TopicRank clustering algorithm fails because there are no keyphrase candidates. Do you have the C-1.xml file, and are you running the example from the same directory? Running cd examples/ and then python keyphrase-extraction.py should do the trick.
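
If changing directory is inconvenient, one possible workaround is to resolve the input document relative to the script's own location instead of the current working directory. The sketch below is only an illustration using the standard library: `resolve_input` is a hypothetical helper, not part of pke, and it assumes C-1.xml sits next to the example script. It fails early with a clear message instead of letting the clustering step run with no candidates.

```python
from pathlib import Path


def resolve_input(filename, script_path):
    """Return the path to `filename` located next to the script.

    Hypothetical helper (not part of pke): raises FileNotFoundError
    when the document is missing, so the problem surfaces before any
    candidate selection or clustering is attempted.
    """
    candidate = Path(script_path).resolve().parent / filename
    if not candidate.is_file():
        raise FileNotFoundError(
            f"{candidate} not found; run the example from its own "
            "directory or pass an absolute path to the input document"
        )
    return candidate
```

Inside the example script this could be called as `resolve_input("C-1.xml", __file__)`, with the returned path passed on to whatever call loads the document.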