damian0604 / bdaca

Course Materials Big Data and Automated Content Analysis

Ngrams explanation not working #51

Closed FeLoe closed 6 years ago

FeLoe commented 6 years ago

In ic2s2.ipynb at "Using ngrams as features" the following code is used:

    assert len(speeches_nl_clean)==len(speeches_nl_bigrams)
    speeches_nl_uniandbigrams = []
    for a,b in zip([speech.split() for speech in speeches_nl_clean],speeches_nl_bigrams):
        speeches_nl_uniandbigrams.append(a + b)

but the unigrams and bigrams do not have the same length, so the code is not working...

damian0604 commented 6 years ago

Thanks for reporting! I think it is actually correct: both are lists of lists, and the assert only compares their outer lengths, which in both cases is the number of speeches. Also, the code runs as expected. Or am I missing something here, @FeLoe?
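To make the point about outer and inner lengths concrete, here is a minimal sketch. The two toy speeches are made up, and the way the bigrams are built (joining neighbouring tokens with an underscore) is an assumption for illustration, not necessarily how the notebook creates `speeches_nl_bigrams`.

```python
# Hypothetical toy data standing in for the notebook's cleaned speeches.
speeches_nl_clean = [
    "dit is een korte toespraak",
    "nog een toespraak",
]

# Assumption: bigrams are stored per speech as a list of "word1_word2" tokens.
speeches_nl_bigrams = [
    ["_".join(pair) for pair in zip(speech.split(), speech.split()[1:])]
    for speech in speeches_nl_clean
]

# Outer level: one entry per speech in both lists, so this check passes.
assert len(speeches_nl_clean) == len(speeches_nl_bigrams)

# Inner level: a speech with n tokens has n - 1 bigrams, so these lengths differ.
for speech, bigrams in zip(speeches_nl_clean, speeches_nl_bigrams):
    print(len(speech.split()), len(bigrams))

# Same loop as in the notebook: concatenate unigrams and bigrams per speech.
speeches_nl_uniandbigrams = []
for a, b in zip([speech.split() for speech in speeches_nl_clean], speeches_nl_bigrams):
    speeches_nl_uniandbigrams.append(a + b)
```

The inner lists never need to be equally long, because `a + b` simply concatenates the two token lists for each speech.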

FeLoe commented 6 years ago

Yes, right... The issue was that I only tested the code on one text instead of a list of texts. I probably should have taken the time to read the part before it ;)
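For anyone running into the same thing, here is a rough sketch of one way the check can fail when a single text is passed instead of a list of texts. The example string and variable names are hypothetical, and this is only a guess at the failure mode, not a reproduction of the original test.

```python
# A single text (a plain string), not a list of speeches.
text = "dit is een toespraak"
tokens = text.split()

# Assumption: bigrams as "word1_word2" tokens, built here without any library.
bigrams = ["_".join(pair) for pair in zip(tokens, tokens[1:])]

# len() of a string counts characters, so comparing it with the number
# of bigrams is not the "number of speeches" comparison the notebook intends.
single_text_passes = len(text) == len(bigrams)

# Wrapping both in a list restores the expected shape: one entry per speech.
speeches = [text]
speeches_bigrams = [bigrams]
list_passes = len(speeches) == len(speeches_bigrams)

print(single_text_passes, list_passes)
```

With a list of texts, the outer lengths are both the number of speeches, which is what the assert in the notebook relies on.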