The filtered text has oddly long entries, with text lengths going above 15k characters?
The data frame created has shape (64, 3), but the text lengths seem spurious. Maybe some issue with the PDF parser?
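In case it helps reproduce this, here is a minimal sketch of how the lengths could be checked. It assumes the text column has been pulled out as a plain list of strings; the name `length_report` and the 15,000-character threshold are placeholders of mine, not anything from the project.

```python
from statistics import median
from typing import Dict, List


def length_report(texts: List[str], max_chars: int = 15000) -> Dict[str, object]:
    """Summarize text lengths and list the rows longer than max_chars.

    `max_chars` is an assumed threshold for "suspiciously long", not a
    value taken from the project.
    """
    lengths = [len(t) for t in texts]
    return {
        "rows": len(lengths),
        "median": median(lengths) if lengths else 0,
        "max": max(lengths, default=0),
        # Indices of rows whose text exceeds the threshold.
        "too_long": [i for i, n in enumerate(lengths) if n > max_chars],
    }
```

If most rows are a few hundred characters and a handful are above the threshold, that would point to the parser merging several sections into one block rather than to a problem in the filtering step.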
Hi
This is a really awesome project, and it could become a very handy open tool using @keerthanpg's Colab! I have been fiddling with the scripts, but I found that not all kinds of papers can be parsed yet. I tried some recent NeurIPS papers: https://proceedings.neurips.cc/paper/2021/file/007ff380ee5ac49ffc34442f5c2a2b86-Paper.pdf, https://proceedings.neurips.cc/paper/2021/file/003dd617c12d444ff9c80f717c3fa982-Paper.pdf
But the parser returned an empty list. Do you have any idea what the issue could be? I suspected that two-column formatting might be the problem, since the first couple of papers I tried were in two-column format and the parser failed on them. But now it's failing on the single-column format as well (NeurIPS, the same venue as the demo paper). Maybe pdfplumber would be a better option?
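For reference, this is roughly what I had in mind with pdfplumber. It is only a sketch: the function names and the file name in the usage note are made up for illustration, and I have not checked how it would fit into the existing scripts.

```python
from typing import List


def extract_page_texts(pdf_path: str) -> List[str]:
    """Return one text string per page, extracted with pdfplumber."""
    # Imported here so the helper below works even without pdfplumber
    # installed (third-party: pip install pdfplumber).
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() can return None for a page without a text layer,
        # so fall back to an empty string.
        return [page.extract_text() or "" for page in pdf.pages]


def count_empty_pages(page_texts: List[str]) -> int:
    """Count pages that produced no text (or only whitespace)."""
    return sum(1 for text in page_texts if not text.strip())
```

Calling `extract_page_texts("paper.pdf")` on one of the failing papers and then `count_empty_pages` on the result would at least show whether the PDF has an extractable text layer at all, which would help separate a text-extraction problem from a section-parsing problem.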