Closed FeLoe closed 6 years ago
Thanks for reporting! I think it is actually correct, as both are lists of lists (and on the outer level, the length is the number of speeches). Also, the code runs as expected. Or am I missing something here, @FeLoe?
Yes, right... The issue was that I only tested the code on one text instead of a list of texts. I probably should have taken the time to read the part before it ;)
In ic2s2.ipynb at "Using ngrams as features" the following code is used:
    assert len(speeches_nl_clean)==len(speeches_nl_bigrams)
    speeches_nl_uniandbigrams = []
    for a,b in zip([speech.split() for speech in speeches_nl_clean],speeches_nl_bigrams):
        speeches_nl_uniandbigrams.append(a + b)
but the unigrams and bigrams do not have the same length, so the code does not work...
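
For anyone else who trips over this, here is a minimal sketch of why the assertion holds when the inputs are lists of speeches. The toy speeches and the way the bigrams are built (joining adjacent tokens with an underscore) are assumptions for illustration only; the notebook may construct `speeches_nl_bigrams` differently. The extra `inner_lengths` variable is hypothetical and only there to show the per-speech counts.

```python
# Toy data standing in for the notebook's variable (hypothetical contents).
speeches_nl_clean = ["dit is een toespraak", "nog een korte"]

# Assumed bigram construction: one list of joined bigrams per speech.
speeches_nl_bigrams = [
    ["_".join(pair) for pair in zip(speech.split(), speech.split()[1:])]
    for speech in speeches_nl_clean
]

# The assert compares the OUTER lengths (number of speeches), which match.
assert len(speeches_nl_clean) == len(speeches_nl_bigrams)

speeches_nl_uniandbigrams = []
for a, b in zip([speech.split() for speech in speeches_nl_clean], speeches_nl_bigrams):
    # a holds n tokens and b holds n - 1 bigrams; concatenating two lists
    # with + does not require them to have the same length.
    speeches_nl_uniandbigrams.append(a + b)

# Per-speech (unigram count, bigram count) pairs: these differ, and that is fine.
inner_lengths = [
    (len(speech.split()), len(bigrams))
    for speech, bigrams in zip(speeches_nl_clean, speeches_nl_bigrams)
]
print(inner_lengths)
print(speeches_nl_uniandbigrams[1])
```

If `speeches_nl_clean` is a single string rather than a list of strings, `len()` counts characters instead of speeches, which is one way the assertion can fail when testing on just one text.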