earlng / academic-pdf-scrap

Code that scrapes the contents of the PDF papers submitted for NeurIPS 2020
MIT License

Double Space #15

Closed earlng closed 3 years ago

earlng commented 3 years ago

Describe the bug: There are also a lot of discrepancies that are just due to double spaces (two consecutive spaces where there should be one), which I suppose is due to the code adding a space between text chunks even when one is already there.

Additional context: That's an easy fix with a find and replace, or through the TRIM() function.
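For reference, here is a rough sketch of both approaches in Python, assuming the scraper collects text chunks as a list of strings. The function names `join_chunks` and `collapse_spaces` are made up for illustration and are not taken from the repository. The first avoids adding a separator when a chunk boundary already has whitespace; the second is the find-and-replace style cleanup applied afterwards.

```python
import re


def join_chunks(chunks):
    """Join text chunks, adding a space only where neither side has whitespace.

    Hypothetical helper: assumes `chunks` is a list of strings.
    """
    result = ""
    for chunk in chunks:
        if not chunk:
            continue  # skip empty chunks so they do not trigger a separator
        # Add a separator only if the boundary has no whitespace on either side.
        if result and not result[-1].isspace() and not chunk[0].isspace():
            result += " "
        result += chunk
    return result


def collapse_spaces(text):
    """Replace any run of two or more spaces with a single space."""
    return re.sub(r" {2,}", " ", text)


if __name__ == "__main__":
    joined = join_chunks(["Hello ", "world", "again"])
    print(collapse_spaces(joined))
```

Note that `join_chunks` alone still leaves a double space when one chunk ends with a space and the next starts with one, so running `collapse_spaces` on the result is the safer catch-all.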