Closed: dgleich closed this issue 2 months ago
Thanks for raising the issue and sharing the code. Indeed, I think using zipped shapefiles directly is becoming more common, with other software like GDAL supporting it. One alternative approach I can think of is using https://github.com/JuliaIO/TranscodingStreams.jl, where users can supply the decompressor from CodecZlib. That way we avoid the JLL dependency while still making it easier to load from a compressed file (not just zipfiles).
Though since zipfiles are the most common case and the JLL dependency is small, perhaps depending directly on CodecZlib is also reasonable.
So the simplest thing might be to set up Shapefile.jl to accept any iterable of file IOs where each file has a .name entry. E.g. so you could call...
shp = Shapefile.Table(ZipFile.Reader("myfile.zip").files)
The ".files" object is really a Vector of IOs, so the generic input could be Vector{T} where T <: IO (but a plain IO doesn't always give a way to list filenames... hmm...)
This would avoid any dependencies, and still make it pretty easy to use.
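To make the "iterable of IOs with a .name entry" idea concrete, here is a rough sketch of how the components could be picked out by extension using only Base. Both `NamedBuffer` and `shapefile_parts` are hypothetical names made up for illustration. Neither is part of Shapefile.jl or ZipFile.jl, and this is not necessarily how #113 ended up doing it.

```julia
# Hypothetical stand-in for an archive entry. Anything with a `.name` field
# would work the same way, e.g. the entries of `ZipFile.Reader(path).files`.
struct NamedBuffer
    name::String
    io::IOBuffer
end

# Illustrative helper (not part of Shapefile.jl): map each shapefile component
# extension to the entry that carries it, ignoring unrelated files.
function shapefile_parts(files)
    wanted = (".shp", ".shx", ".dbf", ".prj")
    parts = Dict{String,Any}()
    for f in files
        # Compare extensions case-insensitively, since archives often mix cases.
        ext = lowercase(splitext(f.name)[2])
        if ext in wanted
            parts[ext] = f
        end
    end
    haskey(parts, ".shp") || throw(ArgumentError("no .shp entry found"))
    return parts
end

entries = [
    NamedBuffer("roads/tl_roads.shp", IOBuffer()),
    NamedBuffer("roads/tl_roads.DBF", IOBuffer()),
    NamedBuffer("roads/readme.txt", IOBuffer()),
]
parts = shapefile_parts(entries)
```

This only relies on `.name`, so it sidesteps the "plain IO has no filename" problem. It assumes one shapefile per archive; an archive holding several would need grouping by file stem as well.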
It sounds like something similar might exist at some point for Tar files too.
@dgleich if you ever wanted to PR this change it would be useful.
This could also be implemented as an extension, with a nice error message saying that you have to load ZipFile.jl for this to work correctly!
Solved by #113
Many Shapefiles are distributed directly as zip files.
The routine (below) shows how to read them directly from the zip file without decompressing it on disk. I used this to read all 3000 zip files from the US road database.
This seems like it might be a useful feature to add to the library. If that's something that might be of interest, let me know as there would be a few different ways this could be integrated into the library.