Closed pkoppstein closed 4 months ago
This happens here in line 267: https://github.com/01mf02/jaq/blob/ce988d55ec1fefaf998a8b77a7c4bb97705852b2/jaq/src/main.rs#L265-L268
It works if you swap out the mmap call with a read_to_end()
call and change the signatures appropriately (diff shown below). I didn't measure the preformance downgrade though.
diff --git a/jaq/src/main.rs b/jaq/src/main.rs
index e4d1a86..33dec80 100644
--- a/jaq/src/main.rs
+++ b/jaq/src/main.rs
@@ -1,5 +1,6 @@
use clap::{Parser, ValueEnum};
use jaq_interpret::{Ctx, Filter, FilterT, ParseCtx, RcIter, Val};
+use std::io::Read;
use std::io::{self, BufRead, Write};
use std::path::PathBuf;
use std::process::{ExitCode, Termination};
@@ -262,9 +263,11 @@ fn parse(filter_str: &str, vars: Vec<String>) -> Result<Filter, Vec<ParseError>>
}
}
-fn mmap_file(path: &std::path::Path) -> io::Result<memmap2::Mmap> {
- let file = std::fs::File::open(path)?;
- unsafe { memmap2::Mmap::map(&file) }
+fn mmap_file(path: &std::path::Path) -> io::Result<Vec<u8>> {
+ let mut file = std::fs::File::open(path)?;
+ let mut contents = Vec::new();
+ file.read_to_end(&mut contents)?;
+ Ok(contents)
}
fn invalid_data(e: impl std::error::Error + Send + Sync + 'static) -> std::io::Error {
My guess is that the mmap operation depends on the file living long enough. I tried swapping out the mmap crate with fmmap, which didn't need the file-opening step, but it failed the same way.
Thanks for the bug report, @pkoppstein, and thanks for the detailed analysis, @kklingenberg!
I believe that the problem is not that the file lives long enough; it is because the file produced by <(echo 1)
is not a regular file that can be memory-mapped.
I originally used mmap
instead of read_to_end
because the latter is generally way slower. So we should not unconditionally use read_to_end
, but only when mmap
fails. That's what I did in my PR #168.
Contrast: