yg4886 closed 9 years ago
Awesome. Thanks! I'll get the codes into some of those.
Thanks.
On Nov 14, 2014, at 10:48 AM, James Howison notifications@github.com wrote:
Awesome. Thanks! I'll get the codes into some of those.
I have added the pdftotext output. The txt_original folder holds the original output without any processing. For the txt_sentences folder, I split the text roughly into sentences using the NLTK tokenizer.
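For anyone who wants to reproduce the sentence splitting, here is a rough sketch of how it could be done. It is not the exact script used for this PR. The function names (`naive_split`, `split_sentences`, `convert_folder`) are made up for illustration, and it assumes the pdftotext files are UTF-8 `.txt` files. By default it calls NLTK's `sent_tokenize`, which needs the NLTK "punkt" data to be installed; a simple regex splitter is included as a stand-in when NLTK is not available.

```python
import re
from pathlib import Path
from typing import Callable, List, Optional


def naive_split(text: str) -> List[str]:
    """Stand-in splitter (not NLTK): break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def split_sentences(
    text: str, tokenizer: Optional[Callable[[str], List[str]]] = None
) -> List[str]:
    """Split text into sentences, one string per sentence."""
    if tokenizer is None:
        # Imported lazily so the rest of the sketch works without NLTK.
        from nltk.tokenize import sent_tokenize  # requires the "punkt" data
        tokenizer = sent_tokenize
    # pdftotext hard-wraps lines, so collapse all whitespace before splitting.
    flat = " ".join(text.split())
    return [s.strip() for s in tokenizer(flat) if s.strip()]


def convert_folder(
    src: str, dst: str, tokenizer: Optional[Callable[[str], List[str]]] = None
) -> int:
    """Write a one-sentence-per-line copy of every .txt file in src into dst."""
    src_dir, dst_dir = Path(src), Path(dst)
    dst_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(src_dir.glob("*.txt")):
        raw = path.read_text(encoding="utf-8", errors="replace")
        sentences = split_sentences(raw, tokenizer)
        (dst_dir / path.name).write_text("\n".join(sentences) + "\n", encoding="utf-8")
        count += 1
    return count
```

With NLTK and its data installed, `convert_folder("txt_original", "txt_sentences")` would produce the second folder from the first. The split is only rough: words that pdftotext hyphenated across line breaks are not rejoined, and abbreviations can still cause wrong breaks.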