srhrshr closed this 9 years ago
Thanks, @SreeHarshaRamesh
No problem at all. Thanks, @charlieg, for a great resource that, as a beginner, I could follow along quite comfortably.
Technically, we didn't need to use sent_tokenize(), but if we only used word_tokenize() alone, we'd see a bunch of extraneous sentence-final punctuation in our output.
Could you give an example of the sentence-final punctuation that word_tokenize() alone leaves in the output? When I compared the two approaches, I found no difference whatsoever.
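A possible explanation, sketched with a toy tokenizer rather than NLTK itself: if a word tokenizer only detaches a period at the very end of its input, then periods that end earlier sentences stay glued to the preceding word unless the text is split into sentences first. Both `toy_word_tokenize` and `toy_sent_tokenize` below are hypothetical stand-ins written for this illustration; they are not NLTK functions and are far cruder than the real ones. As far as I know, recent NLTK releases have word_tokenize() split the text into sentences internally by default, which would explain seeing no difference in practice.

```python
import re


def toy_word_tokenize(text):
    # Hypothetical stand-in, NOT NLTK: detach a period only when it is
    # the last non-space character of the input, then split on whitespace.
    text = re.sub(r"([^.])\.\s*$", r"\1 .", text)
    return text.split()


def toy_sent_tokenize(text):
    # Hypothetical stand-in, NOT NLTK: naive split after a period
    # that is followed by whitespace.
    return re.split(r"(?<=\.)\s+", text.strip())


text = "I like tea. You like coffee."

# Word-tokenize the whole text in one pass.
whole = toy_word_tokenize(text)

# Split into sentences first, then word-tokenize each sentence.
per_sentence = [
    token
    for sentence in toy_sent_tokenize(text)
    for token in toy_word_tokenize(sentence)
]

print(whole)         # ['I', 'like', 'tea.', 'You', 'like', 'coffee', '.']
print(per_sentence)  # ['I', 'like', 'tea', '.', 'You', 'like', 'coffee', '.']
```

In the single-pass result the token `tea.` keeps its period, which is the kind of extraneous sentence-final punctuation the tutorial sentence seems to describe.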
Please correct the line
pip install readability- lxml
to
pip install readability-lxml