Jayman391 / lnlp

MIT License

ValueError: zero-size array to reduction operation maximum which has no identity #18

Open Jayman391 opened 5 months ago

Jayman391 commented 5 months ago

(nllp) MacBook-Air-3:lnlp user$ python main.py tests/test_data/data.csv

Welcome to the NLLP CLI! Loaded data from tests/test_data/data.csv

  1. Run a Topic Model
  2. Run an Optimization routine for a Topic Model (GPU reccomended)
  3. Run a Classification Model
  4. Load Global Configuration Files
  5. Exit
Choose an option: 1
{'errors': [], 'info': ['Initialized Global Session Object and Global Driver', 'Initialized Landing Menu', 'Initialized Embeddings Menu', 'Initialized Dimensionality Reduction Menu', 'Initialized Cluster Menu', 'Initialized Fine Tuning Menu', 'Initialized Plotting Menu', 'Initialized ConfigMenu Menu', 'Initialized Topic Menu', 'Initialized ConfigMenu Menu'], 'data': [{'Landing': 'Topic'}]}
  1. Select LLM to generate Embeddings
  2. Select Dimensionality Reduction Technique
  3. Select Clustering Technique
  4. Fine Tuning
  5. Plotting
  6. Save Session Configuration
  7. Run Topic Model
  8. Load Session Configuration
  9. Back
  10. Exit
Choose an option: 1
{'errors': [], 'info': ['Initialized Global Session Object and Global Driver', 'Initialized Landing Menu', 'Initialized Embeddings Menu', 'Initialized Dimensionality Reduction Menu', 'Initialized Cluster Menu', 'Initialized Fine Tuning Menu', 'Initialized Plotting Menu', 'Initialized ConfigMenu Menu', 'Initialized Topic Menu', 'Initialized ConfigMenu Menu'], 'data': [{'Landing': 'Topic'}, {'Topic': 'Embeddings'}]}
  1. all-MiniLM-L6-v2
  2. all-MiniLM-L12-v2
  3. multi-qa-MiniLM-L6-cos-v1
  4. all-mpnet-base-v2
  5. Muennighoff/SGPT-125M-weightedmean-msmarco-specb-bitfit
  6. Muennighoff/SGPT-125M-weightedmean-nli-bitfit
  7. Muennighoff/SGPT-1.3B-weightedmean-msmarco-specb-bitfit
  8. Add huggingface Model
  9. Back
  10. Exit
Choose an option: 6
{'errors': [], 'info': ['Initialized Global Session Object and Global Driver', 'Initialized Landing Menu', 'Initialized Embeddings Menu', 'Initialized Dimensionality Reduction Menu', 'Initialized Cluster Menu', 'Initialized Fine Tuning Menu', 'Initialized Plotting Menu', 'Initialized ConfigMenu Menu', 'Initialized Topic Menu', 'Initialized ConfigMenu Menu'], 'data': [{'Landing': 'Topic'}, {'Topic': 'Embeddings'}, {'Embeddings': 'Muennighoff/SGPT-125M-weightedmean-nli-bitfit'}]}
  1. Select LLM to generate Embeddings
  2. Select Dimensionality Reduction Technique
  3. Select Clustering Technique
  4. Fine Tuning
  5. Plotting
  6. Save Session Configuration
  7. Run Topic Model
  8. Load Session Configuration
  9. Back
  10. Exit
Choose an option: 2
{'errors': [], 'info': ['Initialized Global Session Object and Global Driver', 'Initialized Landing Menu', 'Initialized Embeddings Menu', 'Initialized Dimensionality Reduction Menu', 'Initialized Cluster Menu', 'Initialized Fine Tuning Menu', 'Initialized Plotting Menu', 'Initialized ConfigMenu Menu', 'Initialized Topic Menu', 'Initialized ConfigMenu Menu'], 'data': [{'Landing': 'Topic'}, {'Topic': 'Embeddings'}, {'Embeddings': 'Muennighoff/SGPT-125M-weightedmean-nli-bitfit'}, {'Topic': 'Dimensionality Reduction'}]}
  1. UMAP
  2. PCA
  3. t-SNE
  4. Truncated SVD
  5. Factor Analysis
  6. Back
  7. Exit
Choose an option: 2
{'errors': [], 'info': ['Initialized Global Session Object and Global Driver', 'Initialized Landing Menu', 'Initialized Embeddings Menu', 'Initialized Dimensionality Reduction Menu', 'Initialized Cluster Menu', 'Initialized Fine Tuning Menu', 'Initialized Plotting Menu', 'Initialized ConfigMenu Menu', 'Initialized Topic Menu', 'Initialized ConfigMenu Menu'], 'data': [{'Landing': 'Topic'}, {'Topic': 'Embeddings'}, {'Embeddings': 'Muennighoff/SGPT-125M-weightedmean-nli-bitfit'}, {'Topic': 'Dimensionality Reduction'}, {'Dimensionality Reduction': 'PCA'}]}
  1. Select LLM to generate Embeddings
  2. Select Dimensionality Reduction Technique
  3. Select Clustering Technique
  4. Fine Tuning
  5. Plotting
  6. Save Session Configuration
  7. Run Topic Model
  8. Load Session Configuration
  9. Back
  10. Exit
Choose an option: 3
{'errors': [], 'info': ['Initialized Global Session Object and Global Driver', 'Initialized Landing Menu', 'Initialized Embeddings Menu', 'Initialized Dimensionality Reduction Menu', 'Initialized Cluster Menu', 'Initialized Fine Tuning Menu', 'Initialized Plotting Menu', 'Initialized ConfigMenu Menu', 'Initialized Topic Menu', 'Initialized ConfigMenu Menu'], 'data': [{'Landing': 'Topic'}, {'Topic': 'Embeddings'}, {'Embeddings': 'Muennighoff/SGPT-125M-weightedmean-nli-bitfit'}, {'Topic': 'Dimensionality Reduction'}, {'Dimensionality Reduction': 'PCA'}, {'Topic': 'Cluster'}]}
  1. hdbscan
  2. kmeans
  3. spectral clustering
  4. dbscan
  5. agglomerative clustering
  6. birch
  7. affinity propagation
  8. mean shift
  9. Back
  10. Exit
Choose an option: 3
{'errors': [], 'info': ['Initialized Global Session Object and Global Driver', 'Initialized Landing Menu', 'Initialized Embeddings Menu', 'Initialized Dimensionality Reduction Menu', 'Initialized Cluster Menu', 'Initialized Fine Tuning Menu', 'Initialized Plotting Menu', 'Initialized ConfigMenu Menu', 'Initialized Topic Menu', 'Initialized ConfigMenu Menu'], 'data': [{'Landing': 'Topic'}, {'Topic': 'Embeddings'}, {'Embeddings': 'Muennighoff/SGPT-125M-weightedmean-nli-bitfit'}, {'Topic': 'Dimensionality Reduction'}, {'Dimensionality Reduction': 'PCA'}, {'Topic': 'Cluster'}, {'Cluster': 'spectral clustering'}]}
  1. Select LLM to generate Embeddings
  2. Select Dimensionality Reduction Technique
  3. Select Clustering Technique
  4. Fine Tuning
  5. Plotting
  6. Save Session Configuration
  7. Run Topic Model
  8. Load Session Configuration
  9. Back
  10. Exit
Choose an option: 4
{'errors': [], 'info': ['Initialized Global Session Object and Global Driver', 'Initialized Landing Menu', 'Initialized Embeddings Menu', 'Initialized Dimensionality Reduction Menu', 'Initialized Cluster Menu', 'Initialized Fine Tuning Menu', 'Initialized Plotting Menu', 'Initialized ConfigMenu Menu', 'Initialized Topic Menu', 'Initialized ConfigMenu Menu'], 'data': [{'Landing': 'Topic'}, {'Topic': 'Embeddings'}, {'Embeddings': 'Muennighoff/SGPT-125M-weightedmean-nli-bitfit'}, {'Topic': 'Dimensionality Reduction'}, {'Dimensionality Reduction': 'PCA'}, {'Topic': 'Cluster'}, {'Cluster': 'spectral clustering'}, {'Topic': 'Fine Tuning'}]}
  1. Enable 2-grams
  2. Enable 3-grams
  3. Ignore Words
  4. Enable BM25 weighting
  5. Reduce frequent words
  6. Enable KeyBERT algorithm
  7. Enable ZeroShotClassification
  8. Enable Maximal Marginal Relevance
  9. Enable Part of Speech filtering
  10. Back
  11. Exit
Choose an option: 1
{'errors': [], 'info': ['Initialized Global Session Object and Global Driver', 'Initialized Landing Menu', 'Initialized Embeddings Menu', 'Initialized Dimensionality Reduction Menu', 'Initialized Cluster Menu', 'Initialized Fine Tuning Menu', 'Initialized Plotting Menu', 'Initialized ConfigMenu Menu', 'Initialized Topic Menu', 'Initialized ConfigMenu Menu'], 'data': [{'Landing': 'Topic'}, {'Topic': 'Embeddings'}, {'Embeddings': 'Muennighoff/SGPT-125M-weightedmean-nli-bitfit'}, {'Topic': 'Dimensionality Reduction'}, {'Dimensionality Reduction': 'PCA'}, {'Topic': 'Cluster'}, {'Cluster': 'spectral clustering'}, {'Topic': 'Fine Tuning'}, {'Fine Tuning': 'Enable 2-grams'}]}
  1. Select LLM to generate Embeddings
  2. Select Dimensionality Reduction Technique
  3. Select Clustering Technique
  4. Fine Tuning
  5. Plotting
  6. Save Session Configuration
  7. Run Topic Model
  8. Load Session Configuration
  9. Back
  10. Exit
Choose an option: 5
{'errors': [], 'info': ['Initialized Global Session Object and Global Driver', 'Initialized Landing Menu', 'Initialized Embeddings Menu', 'Initialized Dimensionality Reduction Menu', 'Initialized Cluster Menu', 'Initialized Fine Tuning Menu', 'Initialized Plotting Menu', 'Initialized ConfigMenu Menu', 'Initialized Topic Menu', 'Initialized ConfigMenu Menu'], 'data': [{'Landing': 'Topic'}, {'Topic': 'Embeddings'}, {'Embeddings': 'Muennighoff/SGPT-125M-weightedmean-nli-bitfit'}, {'Topic': 'Dimensionality Reduction'}, {'Dimensionality Reduction': 'PCA'}, {'Topic': 'Cluster'}, {'Cluster': 'spectral clustering'}, {'Topic': 'Fine Tuning'}, {'Fine Tuning': 'Enable 2-grams'}, {'Topic': 'Plotting'}]}
  1. Enable Topic Visualizations
  2. Enable Document Visualizations
  3. Enable Term Visualizations
  4. Enable All Visualizations
  5. Specify Plot Directory
  6. Specify Web Browser
  7. Back
  8. Exit
Choose an option: 4
{'errors': [], 'info': ['Initialized Global Session Object and Global Driver', 'Initialized Landing Menu', 'Initialized Embeddings Menu', 'Initialized Dimensionality Reduction Menu', 'Initialized Cluster Menu', 'Initialized Fine Tuning Menu', 'Initialized Plotting Menu', 'Initialized ConfigMenu Menu', 'Initialized Topic Menu', 'Initialized ConfigMenu Menu'], 'data': [{'Landing': 'Topic'}, {'Topic': 'Embeddings'}, {'Embeddings': 'Muennighoff/SGPT-125M-weightedmean-nli-bitfit'}, {'Topic': 'Dimensionality Reduction'}, {'Dimensionality Reduction': 'PCA'}, {'Topic': 'Cluster'}, {'Cluster': 'spectral clustering'}, {'Topic': 'Fine Tuning'}, {'Fine Tuning': 'Enable 2-grams'}, {'Topic': 'Plotting'}, {'Plotting': 'Enable All Visualizations'}]}
  1. Select LLM to generate Embeddings
  2. Select Dimensionality Reduction Technique
  3. Select Clustering Technique
  4. Fine Tuning
  5. Plotting
  6. Save Session Configuration
  7. Run Topic Model
  8. Load Session Configuration
  9. Back
  10. Exit
Choose an option: 5
{'errors': [], 'info': ['Initialized Global Session Object and Global Driver', 'Initialized Landing Menu', 'Initialized Embeddings Menu', 'Initialized Dimensionality Reduction Menu', 'Initialized Cluster Menu', 'Initialized Fine Tuning Menu', 'Initialized Plotting Menu', 'Initialized ConfigMenu Menu', 'Initialized Topic Menu', 'Initialized ConfigMenu Menu'], 'data': [{'Landing': 'Topic'}, {'Topic': 'Embeddings'}, {'Embeddings': 'Muennighoff/SGPT-125M-weightedmean-nli-bitfit'}, {'Topic': 'Dimensionality Reduction'}, {'Dimensionality Reduction': 'PCA'}, {'Topic': 'Cluster'}, {'Cluster': 'spectral clustering'}, {'Topic': 'Fine Tuning'}, {'Fine Tuning': 'Enable 2-grams'}, {'Topic': 'Plotting'}, {'Plotting': 'Enable All Visualizations'}, {'Topic': 'Plotting'}]}
  1. Enable Topic Visualizations
  2. Enable Document Visualizations
  3. Enable Term Visualizations
  4. Enable All Visualizations
  5. Specify Plot Directory
  6. Specify Web Browser
  7. Back
  8. Exit
Choose an option: 5
Enter the plot directory: tests
directory : tests
{'errors': [], 'info': ['Initialized Global Session Object and Global Driver', 'Initialized Landing Menu', 'Initialized Embeddings Menu', 'Initialized Dimensionality Reduction Menu', 'Initialized Cluster Menu', 'Initialized Fine Tuning Menu', 'Initialized Plotting Menu', 'Initialized ConfigMenu Menu', 'Initialized Topic Menu', 'Initialized ConfigMenu Menu'], 'data': [{'Landing': 'Topic'}, {'Topic': 'Embeddings'}, {'Embeddings': 'Muennighoff/SGPT-125M-weightedmean-nli-bitfit'}, {'Topic': 'Dimensionality Reduction'}, {'Dimensionality Reduction': 'PCA'}, {'Topic': 'Cluster'}, {'Cluster': 'spectral clustering'}, {'Topic': 'Fine Tuning'}, {'Fine Tuning': 'Enable 2-grams'}, {'Topic': 'Plotting'}, {'Plotting': 'Enable All Visualizations'}, {'Topic': 'Plotting'}, {'Plotting': 'tests'}]}
  1. Select LLM to generate Embeddings
  2. Select Dimensionality Reduction Technique
  3. Select Clustering Technique
  4. Fine Tuning
  5. Plotting
  6. Save Session Configuration
  7. Run Topic Model
  8. Load Session Configuration
  9. Back
  10. Exit
Choose an option: 7
{'errors': [], 'info': ['Initialized Global Session Object and Global Driver', 'Initialized Landing Menu', 'Initialized Embeddings Menu', 'Initialized Dimensionality Reduction Menu', 'Initialized Cluster Menu', 'Initialized Fine Tuning Menu', 'Initialized Plotting Menu', 'Initialized ConfigMenu Menu', 'Initialized Topic Menu', 'Initialized ConfigMenu Menu'], 'data': [{'Landing': 'Topic'}, {'Topic': 'Embeddings'}, {'Embeddings': 'Muennighoff/SGPT-125M-weightedmean-nli-bitfit'}, {'Topic': 'Dimensionality Reduction'}, {'Dimensionality Reduction': 'PCA'}, {'Topic': 'Cluster'}, {'Cluster': 'spectral clustering'}, {'Topic': 'Fine Tuning'}, {'Fine Tuning': 'Enable 2-grams'}, {'Topic': 'Plotting'}, {'Plotting': 'Enable All Visualizations'}, {'Topic': 'Plotting'}, {'Plotting': 'tests'}, {'Topic': 'BERTopic(calculate_probabilities=False, ctfidf_model=ClassTfidfTransformer(...), embedding_model=SentenceTransformer(...), hdbscan_model=HDBSCAN(...), language=None, low_memory=False, min_topic_size=10, n_gram_range=(1, 1), nr_topics=None, representation_model=None, seed_topic_list=None, top_n_words=10, umap_model=PCA(...), vectorizer_model=CountVectorizer(...), verbose=False, zeroshot_min_similarity=0.7, zeroshot_topic_list=None)'}]}
Number of boolean topics: 0
[3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 3, 3, 3, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3, 3, 3, 3, 3, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 3, 3, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 3, 3, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 3, 3, 
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 3, 3, 3, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]
[{'Plotting': 'Enable All Visualizations'}, {'Plotting': 'tests'}]
An error occurred. Please try again.
Would you like to see the error trace? (y/n): y
Traceback (most recent call last):
  File "/Users/user/Desktop/Spring_2024/Research/lnlp/src/_lnlpcli.py", line 46, in run
    self._process_responses(self.landing, self.driver)
  File "/Users/user/Desktop/Spring_2024/Research/lnlp/src/_lnlpcli.py", line 64, in _process_responses
    self._process_responses(response, driver)
  File "/Users/user/Desktop/Spring_2024/Research/lnlp/src/_lnlpcli.py", line 64, in _process_responses
    self._process_responses(response, driver)
  File "/Users/user/Desktop/Spring_2024/Research/lnlp/src/_lnlpcli.py", line 68, in _process_responses
    self._process_responses(menu.parent, driver)
  File "/Users/user/Desktop/Spring_2024/Research/lnlp/src/_lnlpcli.py", line 64, in _process_responses
    self._process_responses(response, driver)
  File "/Users/user/Desktop/Spring_2024/Research/lnlp/src/_lnlpcli.py", line 68, in _process_responses
    self._process_responses(menu.parent, driver)
  File "/Users/user/Desktop/Spring_2024/Research/lnlp/src/_lnlpcli.py", line 64, in _process_responses
    self._process_responses(response, driver)
  File "/Users/user/Desktop/Spring_2024/Research/lnlp/src/_lnlpcli.py", line 68, in _process_responses
    self._process_responses(menu.parent, driver)
  File "/Users/user/Desktop/Spring_2024/Research/lnlp/src/_lnlpcli.py", line 64, in _process_responses
    self._process_responses(response, driver)
  File "/Users/user/Desktop/Spring_2024/Research/lnlp/src/_lnlpcli.py", line 68, in _process_responses
    self._process_responses(menu.parent, driver)
  File "/Users/user/Desktop/Spring_2024/Research/lnlp/src/_lnlpcli.py", line 64, in _process_responses
    self._process_responses(response, driver)
  File "/Users/user/Desktop/Spring_2024/Research/lnlp/src/_lnlpcli.py", line 68, in _process_responses
    self._process_responses(menu.parent, driver)
  File "/Users/user/Desktop/Spring_2024/Research/lnlp/src/_lnlpcli.py", line 64, in _process_responses
    self._process_responses(response, driver)
  File "/Users/user/Desktop/Spring_2024/Research/lnlp/src/_lnlpcli.py", line 68, in _process_responses
    self._process_responses(menu.parent, driver)
  File "/Users/user/Desktop/Spring_2024/Research/lnlp/src/_lnlpcli.py", line 66, in _process_responses
    driver.run_topic_model()
  File "/Users/user/Desktop/Spring_2024/Research/lnlp/src/drivers/_driver.py", line 106, in run_topic_model
    self._visualize_topics(model, topics, dir)
  File "/Users/user/Desktop/Spring_2024/Research/lnlp/src/drivers/_driver.py", line 119, in _visualize_topics
    topic_viz = model.visualize_topics()
  File "/Users/user/anaconda3/envs/nllp/lib/python3.9/site-packages/bertopic-0.16.0-py3.9.egg/bertopic/_bertopic.py", line 2249, in visualize_topics
  File "/Users/user/anaconda3/envs/nllp/lib/python3.9/site-packages/bertopic-0.16.0-py3.9.egg/bertopic/plotting/_topics.py", line 79, in visualize_topics
  File "/Users/user/anaconda3/envs/nllp/lib/python3.9/site-packages/umaplearn-0.5.5-py3.9.egg/umap/umap.py", line 2887, in fit_transform
    self.fit(X, y, force_all_finite)
  File "/Users/user/anaconda3/envs/nllp/lib/python3.9/site-packages/umaplearn-0.5.5-py3.9.egg/umap/umap.py", line 2780, in fit
    self.embedding_, aux_data = self._fit_embed_data(
  File "/Users/user/anaconda3/envs/nllp/lib/python3.9/site-packages/umaplearn-0.5.5-py3.9.egg/umap/umap.py", line 2826, in _fit_embed_data
    return simplicial_set_embedding(
  File "/Users/user/anaconda3/envs/nllp/lib/python3.9/site-packages/umaplearn-0.5.5-py3.9.egg/umap/umap.py", line 1086, in simplicial_set_embedding
    graph.data[graph.data < (graph.data.max() / float(n_epochs_max))] = 0.0
  File "/Users/user/anaconda3/envs/nllp/lib/python3.9/site-packages/numpy/core/_methods.py", line 41, in _amax
    return umr_maximum(a, axis, None, out, keepdims, initial, where)
ValueError: zero-size array to reduction operation maximum which has no identity
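
For reference, the last frames of the trace show `graph.data.max()` being called on an array with no elements, which is what NumPy reports as a zero-size reduction. A minimal sketch of that behavior and of a size check that avoids it, assuming only NumPy; the helper names `empty_max_message` and `max_or_none` are made up for illustration and are not part of lnlp, BERTopic, or UMAP:

```python
import numpy as np


def empty_max_message():
    """Call .max() on an empty array and return the resulting error text."""
    try:
        np.asarray([], dtype=float).max()
    except ValueError as exc:
        return str(exc)
    return "no error"


def max_or_none(values):
    """Return the maximum of `values`, or None when there is nothing to reduce."""
    arr = np.asarray(values, dtype=float)
    if arr.size == 0:
        # Checking the size first avoids the ValueError from an empty reduction.
        return None
    return float(arr.max())


if __name__ == "__main__":
    print(empty_max_message())
    print(max_or_none([]))
    print(max_or_none([1.0, 3.0, 2.0]))
```

This only illustrates the NumPy side of the failure; it does not explain why the graph ends up empty inside UMAP.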
Jayman391 commented 5 months ago

Pretty sure this can happen when num features < num data points (so at basically any point?). To fix it I would either:
- Re-embed the data and try again
- Run PCA on the embeddings and then do UMAP
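
A rough sketch of the PCA step from the second option, assuming the embeddings are available as a 2-D NumPy array. `pca_reduce` is a hypothetical helper, not something in lnlp; it does plain PCA through an SVD of the centered data, and the UMAP step itself is left out. Whether this actually avoids the error has not been tested here.

```python
import numpy as np


def pca_reduce(embeddings, n_components):
    """Project embeddings onto their top principal components (plain PCA via SVD)."""
    x = np.asarray(embeddings, dtype=float)
    if x.ndim != 2 or x.shape[0] == 0:
        raise ValueError("expected a non-empty 2-D array of embeddings")
    # The number of components cannot exceed either dimension of the data.
    k = min(n_components, x.shape[0], x.shape[1])
    centered = x - x.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_embeddings = rng.normal(size=(20, 8))  # stand-in data, not real embeddings
    reduced = pca_reduce(fake_embeddings, 3)
    print(reduced.shape)
```

The reduced array could then be handed to UMAP in place of the raw embeddings.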