Closed rise0chen closed 1 year ago
It panicks at error "Invalid checksum" which could be that you zip file is corrupted, this is not zip itself panics.
And you don't need to create files = vec![zip.clone(); zip.len()]
which is inefficient, you can just do:
let mut files: Vec<_> = (0..zip.len())
.map(|i| zip_file.by_index(i).unwrap())
.collect();
use std::fs;
use std::io::{Read, Result as IoResult, Write};
use zip::{ZipArchive, ZipWriter};
const PATH: &'static str = "./test.zip";
fn main() -> IoResult<()> {
create_zip()?;
let zip_file = fs::File::open(PATH)?;
let mut zip = ZipArchive::new(&zip_file)?;
{
let mut file = zip.by_index(0).unwrap();
let mut s = String::new();
file.read_to_string(&mut s).unwrap();
println!("{}", s);
}
let mut files = vec![zip.clone(); zip.len()];
let mut files: Vec<_> = files
.iter_mut()
.enumerate()
.map(|(i, file)| file.by_index(i).unwrap())
.collect();
files.iter_mut().for_each(|x| {
let mut s = String::new();
x.read_to_string(&mut s).unwrap();
println!("{}", s);
});
Ok(())
}
fn create_zip() -> IoResult<()> {
let zip_file = fs::File::create(PATH)?;
let mut zip = ZipWriter::new(zip_file);
zip.start_file("1", Default::default())?;
zip.write_all("data1".as_bytes())?;
zip.start_file("1", Default::default())?;
zip.write_all("data1-1".as_bytes())?;
zip.start_file("2", Default::default())?;
zip.write_all("data2".as_bytes())?;
zip.start_file("3", Default::default())?;
zip.write_all("data3".as_bytes())?;
zip.finish()?;
Ok(())
}
zip file is created by ZipWriter, and it is not corrupted.
It need &mut self. mutable reference can only borrow one time. So, I need clone ZipArchive.
Oh you are right.
BTW I think the title is still a bit misleading.
The zip code itself does not panic while extracting, it merely returns an error.
The panic is caused by the unwrap
you use.
IMO it will be better to say that use it later gives "Invalid checksum" error.
I don't think it should return "Invalid checksum".
use std::fs;
use std::io::{Read, Result as IoResult, Write};
use zip::{ZipArchive, ZipWriter};
const PATH: &'static str = "./test.zip";
fn main() -> IoResult<()> {
create_zip()?;
let zip_file = fs::File::open(PATH)?;
let mut zip = ZipArchive::new(&zip_file)?;
{
let mut file1 = zip.by_index(0).unwrap();
let mut s = String::new();
file1.read_to_string(&mut s).unwrap();
println!("{}", s);
}
{
let mut file2 = zip.by_index(1).unwrap();
let mut s = String::new();
file2.read_to_string(&mut s).unwrap();
println!("{}", s);
}
{
let mut zip1 = zip.clone();
let mut zip2 = zip.clone();
let mut file1 = zip1.by_index(0).unwrap();
let mut file2 = zip2.by_index(1).unwrap();
let mut s = String::new();
file1.read_to_string(&mut s).unwrap();
println!("{}", s);
let mut s = String::new();
file2.read_to_string(&mut s).unwrap();
println!("{}", s);
}
Ok(())
}
fn create_zip() -> IoResult<()> {
let zip_file = fs::File::create(PATH)?;
let mut zip = ZipWriter::new(zip_file);
zip.start_file("1", Default::default())?;
zip.write_all("data1".as_bytes())?;
zip.start_file("1", Default::default())?;
zip.write_all("data1-1".as_bytes())?;
zip.start_file("2", Default::default())?;
zip.write_all("data2".as_bytes())?;
zip.start_file("3", Default::default())?;
zip.write_all("data3".as_bytes())?;
zip.finish()?;
Ok(())
}
It can work when use ZipFile one by one, and it return "Invalid checksum" when muti ZipFile
I don't think it should return "Invalid checksum".
I'm not saying it should, I'm suggesting a better title would be mention "invalid checksum" instead of panic.
It can work when use ZipFile one by one, and it return "Invalid checksum" when muti ZipFile
This might have to do with this line:
let mut zip = ZipArchive::new(&zip_file)?;
You are sharing one std::fs::File
in multiple ZipArchive
.
Each ZipArchive
do some reads and seeks, which changes the internal cursor of std::fs::File
maintained by the OS.
The way to fix this error is to create multiple independent std::fs::File
by either opening multiple times, or using File::try_clone
, or you can use the std::os::unix::fs::FileExt
interface and create a new type like this:
struct FileRef<'a>(&'a std::fs::File, u64);
and manually implements Read
and Seek
on it, but that will be unix-only.
Is the internal cursor changed when call by_index
? Can we change the internal cursor before call read
?
Is the internal cursor changed when call by_index?
Yes, it certainly is changed.
by_index
uses seek
to move the cursor then read
from it.
Can we change the internal cursor before call read?
ZipArchiver
assumes that the reader you passed in does not get changed by others.
So to fix this, you either creates multiple std::fs::File
or use unix-only std::os::unix::fs::FileExt.
Another option is to use mmarinus to mmap the file and obtain a &[u8]
reference, then wrap that in std::io::Cursor
and passed it to ZipArchive
.
This would also reduce the I/O.
creates multiple std::fs::File
also need creates multiple ZipArchive
.
Maybe, we should remove Clone
of ZipArchive
.
creates multiple std::fs::File also need creates multiple ZipArchive.
You got it in the wrong order:
creates multiple ZipArchive
also need multiple std::fs::File
.
The Clone
implementation of ZipArchive
is indeed a bit confusing when combined with the fact that &std::fs::File
also implements Read
and Seek
, removing Clone
implementation would indeed makes it less confusing.
@rise0chen While I'm not the maintainer of this crate, you can open a PR and I believe @zamazan4ik can review this since it will be pretty simple and straight-forward.
let mut zip1 = zip.clone();
let mut zip2 = zip.clone();
let mut file1 = zip1.by_index(0).unwrap();
let mut file2 = zip2.by_index(1).unwrap();
(&zip_file).seek(SeekFrom::Start(file1.data_start()))?;
let mut s = String::new();
file1.read_to_string(&mut s).unwrap();
println!("{}", s);
(&zip_file).seek(SeekFrom::Start(file2.data_start()))?;
let mut s = String::new();
file2.read_to_string(&mut s).unwrap();
println!("{}", s);
It work, if I change the internal cursor before call read.
Maybe, it's a bug of standard library.
It shouldn't impl Read
for &File
Maybe, it's a bug of standard library. It shouldn't impl
Read
for&File
That could be considered a bug indeed, maybe you can raise that in zulip?
This is part of a much wider discussion in Rust of IO safety, but the gist of it is that yes, you need to be aware of file handles being fiddled with from under you. Similar issues will happen if another process edits a file while you're reading from it.
ZipArchive will keep the Clone implementation. Internal mutability is unfortunately a hazard that you need to keep in mind.
panicked at
x.read_to_string(&mut s).unwrap();