kenzeng24 / social-network-url-clustering

MIT License

interactions.py file's cluster_tm_analysis doesn't run due to tfidf_vectorize not being defined? #2

Open achen004 opened 2 years ago

achen004 commented 2 years ago

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import sys
sys.path.append('/content/social-network-url-clustering/src')
from data_loading.interactions import get_metadata, aggregate_text, cluster_tm_analysis
from preprocessing.vectorize_text import tfidf_vectorize, topic_generator

Error after using cluster_tm_analysis:

topics_list=cluster_tm_analysis(CLUSTER_FILE, METADATA_FILE, 1, 10)

NameError                                 Traceback (most recent call last)
in
----> 1 topics_list=cluster_tm_analysis(CLUSTER_FILE , METADATA_FILE, 1, 10)

/content/social-network-url-clustering/src/data_loading/interactions.py in cluster_tm_analysis(cluster_json, filename, ngram, num_topics, i_min, n_len)
     65     filtered_data = subset[subset.agg_text.apply(len)>n_len]
     66
---> 67     tfidf, features = tfidf_vectorize(filtered_data.agg_text, ngram=ngram) #change ngrams here? predefined vocab can be adjusted
     68
     69     outputs=topic_generator(tfidf, features, num_topics=num_topics)

NameError: name 'tfidf_vectorize' is not defined

kenzeng24 commented 2 years ago

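You never imported tfidf_vectorize in interactions.py. Importing it in your notebook doesn't help, because cluster_tm_analysis looks the name up in the module where it is defined, not in the caller's namespace. I would move cluster_tm_analysis into vectorize_text.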
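
For anyone hitting the same error, here is a minimal sketch of why the notebook import doesn't help. It does not use the repository's code: the three modules are built in memory as hypothetical stand-ins, and the body of tfidf_vectorize is a placeholder that only counts words. It shows that a function resolves global names in its own module, so the import has to live there.

```python
import types

# Hypothetical stand-in for preprocessing/vectorize_text.py.
# The real tfidf_vectorize is not reproduced; this placeholder counts words.
vectorize_text = types.ModuleType("vectorize_text")
exec(
    "def tfidf_vectorize(texts, ngram=1):\n"
    "    return [len(t.split()) for t in texts]\n",
    vectorize_text.__dict__,
)

# Source of a simplified cluster_tm_analysis (assumed signature, for illustration).
ANALYSIS_SRC = (
    "def cluster_tm_analysis(texts, ngram=1):\n"
    "    return tfidf_vectorize(texts, ngram=ngram)\n"
)

# Stand-in for interactions.py WITHOUT the import.
broken = types.ModuleType("interactions_broken")
exec(ANALYSIS_SRC, broken.__dict__)

# The caller (the notebook) imports tfidf_vectorize into its OWN namespace.
tfidf_vectorize = vectorize_text.tfidf_vectorize

error_message = None
try:
    broken.cluster_tm_analysis(["a b c"])
except NameError as err:
    # The function only sees the globals of the module it was defined in.
    error_message = str(err)

# Stand-in for interactions.py WITH the import at module level, i.e. the
# equivalent of `from preprocessing.vectorize_text import tfidf_vectorize`.
fixed = types.ModuleType("interactions_fixed")
fixed.tfidf_vectorize = vectorize_text.tfidf_vectorize
exec(ANALYSIS_SRC, fixed.__dict__)

result = fixed.cluster_tm_analysis(["a b c", "d e"])
print(error_message)
print(result)
```

In the actual repo, that would mean either adding the import to the top of interactions.py or moving cluster_tm_analysis into vectorize_text as suggested above, where tfidf_vectorize is already defined. If the import is added instead, it is worth checking that vectorize_text does not import from interactions in turn, since that would create a circular import.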