metakgp / chillzone

Find a place to chill during class hours in IIT KGP
https://chill.metakgp.org
GNU General Public License v3.0

Support for Python > 3.8 #39

Open harshkhandeparkar opened 1 year ago

harshkhandeparkar commented 1 year ago

The PDF library used to read timetables, camelot-py, only supports Python versions 3.6, 3.7, and 3.8. Support for Python 3.10+ will become mandatory within a year, since Python 3.8 stops receiving security updates in Oct 2024.

Possible solutions:

anuraganand92 commented 1 year ago

I'm working on using tabula-py as the alternative library to fix this, with pandas to export the result to Excel.

harshkhandeparkar commented 1 year ago

@shikharish

shikharish commented 1 year ago

@anuraganand92 Go ahead! And do share your progress. How is tabula working on the PDF?

anuraganand92 commented 12 months ago

I discarded tabula because it wasn't good or fast enough for parsing. I then tried pdfplumber, which gave results similar to camelot. I ran it on test.pdf, but I am not sure whether the parsing format in test.xlsx is the correct one, because some cells have multiple entries or a different arrangement of entries.
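
For anyone who wants to reproduce this, here is a rough sketch of the pdfplumber route, not the exact script used. It assumes that a cell with multiple entries comes back as one string with the entries separated by newlines, which may not hold for every timetable PDF. The helper names (`split_cell`, `normalise_table`, `extract_tables`, `export_to_excel`) are made up for illustration and are not part of chillzone.

```python
from typing import List, Optional

# A table row as returned by pdfplumber: cell text, or None for an empty cell.
Row = List[Optional[str]]


def split_cell(cell: Optional[str]) -> List[str]:
    """Split one extracted cell into its separate entries.

    Assumption: multiple entries inside a cell are newline-separated.
    """
    if cell is None:
        return []
    return [part.strip() for part in cell.splitlines() if part.strip()]


def normalise_table(table: List[Row]) -> List[List[List[str]]]:
    """Replace every cell of a table with its list of entries."""
    return [[split_cell(cell) for cell in row] for row in table]


def extract_tables(pdf_path: str) -> List[List[Row]]:
    """Collect the tables pdfplumber finds on every page of the PDF."""
    import pdfplumber  # third-party, imported lazily

    tables: List[List[Row]] = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            tables.extend(page.extract_tables())
    return tables


def export_to_excel(table: List[Row], xlsx_path: str) -> None:
    """Write one table to an .xlsx file, joining multi-entry cells with ' / '."""
    import pandas as pd  # third-party; writing .xlsx also needs openpyxl

    flat = [[" / ".join(entries) for entries in row]
            for row in normalise_table(table)]
    pd.DataFrame(flat).to_excel(xlsx_path, index=False, header=False)
```

Something like `export_to_excel(extract_tables("test.pdf")[0], "test.xlsx")` would write the first detected table. Keeping the cell splitting separate from the extraction means the "multiple entries per cell" question can be settled without re-running the PDF parser.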

shikharish commented 12 months ago

Yes, I myself tried tabula, pdfplumber and a few others. None of them were as good as camelot. If we can't find an alternative, forking and updating camelot seems like the only option.