vineetjohn / linguistic-style-transfer

Neural network parametrized objective to disentangle and transfer style and content in text
Apache License 2.0
138 stars 33 forks

A Question About data_processor.populate_word_blacklist #72

Open Doragd opened 4 years ago

Doragd commented 4 years ago
def populate_word_blacklist(word_index):
    blacklisted_words = set()
    blacklisted_words |= set(global_config.predefined_word_index.values())
    if global_config.filter_sentiment_words:
        blacklisted_words |= lexicon_helper.get_sentiment_words()
    if global_config.filter_stopwords:
        blacklisted_words |= lexicon_helper.get_stopwords()
  1. global_config.predefined_word_index.values() returns the indices of the words, not the words themselves.
  2. At this point, the actual value of global_config.predefined_word_index is equal to word_index, not just {'<unk>': 0,'<sos>': 1,'<eos>': 2}.
  3. Therefore, I think blacklisted_words contains unnecessary entries and does not match the intended meaning of a blacklist.
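
To illustrate the three points above, here is a minimal, self-contained sketch. It does not use the repository's code: `predefined_word_index`, `full_word_index`, `stopwords`, and both helper functions are hypothetical stand-ins, and the `keys()` variant is only one possible fix, assuming the blacklist is meant to hold word strings.

```python
# Hypothetical stand-in for global_config.predefined_word_index (assumption).
predefined_word_index = {"<unk>": 0, "<sos>": 1, "<eos>": 2}

# Hypothetical stand-in for the state described in point 2, where the
# predefined mapping has grown to match the full word_index.
full_word_index = {"<unk>": 0, "<sos>": 1, "<eos>": 2, "good": 3, "movie": 4}

# Hypothetical stand-in for lexicon_helper.get_stopwords() (assumption).
stopwords = {"the", "a"}


def blacklist_from_values(word_to_index, extra_words):
    """Mirrors the pattern in the issue: values() contributes integer indices."""
    blacklisted = set()
    blacklisted |= set(word_to_index.values())
    blacklisted |= set(extra_words)
    return blacklisted


def blacklist_from_keys(word_to_index, extra_words):
    """Possible alternative: keys() contributes the word strings."""
    blacklisted = set(word_to_index.keys())
    blacklisted |= set(extra_words)
    return blacklisted


from_values = blacklist_from_values(predefined_word_index, stopwords)
from_keys = blacklist_from_keys(predefined_word_index, stopwords)
from_full_values = blacklist_from_values(full_word_index, stopwords)

print("<unk>" in from_values)   # False: only the index 0 was added
print("<unk>" in from_keys)     # True: the token itself was added
print(sorted(x for x in from_full_values if isinstance(x, int)))
# [0, 1, 2, 3, 4]: every vocabulary index ends up in the blacklist
```

With the `values()` pattern, the set mixes integers with strings, so a membership check against a word string never matches the special tokens. If the mapping has already grown to the full vocabulary, every index is added as well.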