Open zthatch opened 2 years ago
Hard agree with this!
https://docs.python.org/3/library/zipfile.html#zipfile.ZipFile.extractall
Perhaps extract to a tmp directory instead of the default current working directory, to make cleanup easier to manage?
This implementation decodes the zip file line by line rather than extracting it in its entirety and then reading it as a CSV. This is a significant bottleneck in the code. It could be improved by using extractall to extract the zip file to a temporary directory and then iterating through the rows with the csv reader. That change would also require updating the line processing to use StringIO (rather than BytesIO) when loading the table into a dataframe.
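A rough sketch of what that could look like, using only the standard library. The function name `read_csv_rows_from_zip` and the `member_name` parameter are made up for illustration and are not taken from the existing code. It also assumes the CSV is UTF-8 encoded and that the archive comes from a trusted source.

```python
import csv
import tempfile
import zipfile
from pathlib import Path


def read_csv_rows_from_zip(zip_path, member_name):
    """Extract the archive to a temporary directory and return the CSV rows.

    Hypothetical helper: the name and signature are assumptions, not part
    of the current implementation.
    """
    # TemporaryDirectory removes the extracted files when the block exits,
    # so nothing is left behind in the current working directory.
    with tempfile.TemporaryDirectory() as tmp_dir:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(tmp_dir)

        csv_path = Path(tmp_dir) / member_name
        # Opening in text mode means csv.reader receives str rows,
        # so no manual decoding of bytes is needed. UTF-8 is assumed.
        with open(csv_path, newline="", encoding="utf-8") as handle:
            return list(csv.reader(handle))
```

The rows are read into a list before the temporary directory is cleaned up, since the extracted file no longer exists afterwards. If a dataframe is the end goal, the extracted path could probably be handed to `pandas.read_csv` inside the same `with` block instead.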