Closed GoogleCodeExporter closed 9 years ago
Taking a quick look at the source code, it's as I suspected... yr_scan_file()
basically mmap()'s the entire file first, and then begins to evaluate the
rules...
Specifically, first it does this:
pmapped_file->size = fstat.st_size;
pmapped_file->data = (unsigned char*) mmap(0, pmapped_file->size, PROT_READ,
MAP_PRIVATE, pmapped_file->file, 0);
And then, if that worked OK, it does this:
result = yr_scan_mem(mfile.data, mfile.size, context, callback, user_data);
Which in turn ends up in yr_scan_mem_blocks() ...
[ stuff... ]
/* initialize global rules flag for all namespaces */
[ stuff... ]
/* evaluate global rules */
[ etc. ]
And then the rule actually gets checked in eval.c (or something like that)...
case TERM_TYPE_FILESIZE:
return context->file_size;
... And it's kinda too late to check at this point, the file has already been
read. (Or at least as much of it as your OS kernel is willing to buffer in
advance, which is actually quite a lot.)
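For illustration, here is a minimal sketch of doing the size check before the mmap() call rather than after the scan. This is not libyara code: map_if_small_enough() and its max_size parameter are hypothetical names made up for this example, and only open(), fstat(), mmap() and close() are real POSIX calls.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of libyara: map `path` read-only, but only
 * if its size (as reported by fstat) is in the range 1..max_size.
 * Returns the mapping and stores its length in *out_size, or returns NULL
 * if the file was skipped or an error occurred. */
static unsigned char* map_if_small_enough(const char* path, off_t max_size,
                                          size_t* out_size)
{
  assert(path != NULL && out_size != NULL);

  int fd = open(path, O_RDONLY);
  if (fd == -1)
    return NULL;

  struct stat st;
  if (fstat(fd, &st) == -1 || st.st_size <= 0 || st.st_size > max_size)
  {
    /* The size check happens here, before anything is mapped. */
    close(fd);
    return NULL;
  }

  void* data = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
  close(fd); /* the mapping stays valid after the descriptor is closed */

  if (data == MAP_FAILED)
    return NULL;

  *out_size = (size_t) st.st_size;
  return (unsigned char*) data;
}
```

A caller would pass the returned pointer and size on to the scanner and munmap() it afterwards. The catch is that a fixed max_size only helps if the caller already knows what limit the rules imply, which is exactly the information that is buried in the rule conditions.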
Original comment by juliavi...@gmail.com
on 10 Feb 2013 at 4:05
YARA scans the file first looking for all the strings of every rule, and only
after the scanning phase has concluded does it proceed to evaluate the rule
conditions. It would be great if YARA were smart enough to realize that in some
cases the file doesn't need to be scanned at all, but detecting those
situations is not trivial.
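To make the idea concrete, here is a rough sketch of one conservative way to detect such cases: evaluate each condition with only the file size known, treat every string term as "unknown", and scan only if some rule's result is still unknown. Everything here (node, tristate, pre_eval(), scan_needed()) is hypothetical and invented for this example. It is not how YARA represents or evaluates conditions, and it covers only a toy subset of what a real condition can contain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; this is not YARA's term representation. */
typedef enum { TV_FALSE, TV_TRUE, TV_UNKNOWN } tristate;

typedef enum { N_FILESIZE_LT, N_STRING, N_AND, N_OR, N_NOT } node_type;

typedef struct node {
  node_type type;
  uint64_t limit;            /* N_FILESIZE_LT: true when filesize < limit */
  const struct node* left;   /* operand of N_AND, N_OR, N_NOT */
  const struct node* right;  /* second operand of N_AND, N_OR */
} node;

/* Evaluate a condition knowing only the file size. String terms are
 * TV_UNKNOWN because no scan has happened yet. */
static tristate pre_eval(const node* n, uint64_t file_size)
{
  tristate l, r;

  assert(n != NULL);

  switch (n->type)
  {
    case N_FILESIZE_LT:
      return file_size < n->limit ? TV_TRUE : TV_FALSE;
    case N_STRING:
      return TV_UNKNOWN;
    case N_NOT:
      l = pre_eval(n->left, file_size);
      if (l == TV_UNKNOWN) return TV_UNKNOWN;
      return l == TV_TRUE ? TV_FALSE : TV_TRUE;
    case N_AND:
      l = pre_eval(n->left, file_size);
      r = pre_eval(n->right, file_size);
      if (l == TV_FALSE || r == TV_FALSE) return TV_FALSE;
      if (l == TV_TRUE && r == TV_TRUE) return TV_TRUE;
      return TV_UNKNOWN;
    case N_OR:
      l = pre_eval(n->left, file_size);
      r = pre_eval(n->right, file_size);
      if (l == TV_TRUE || r == TV_TRUE) return TV_TRUE;
      if (l == TV_FALSE && r == TV_FALSE) return TV_FALSE;
      return TV_UNKNOWN;
  }
  return TV_UNKNOWN;
}

/* The scan is needed only if some rule's outcome still depends on strings. */
static int scan_needed(const node* const* rules, size_t count,
                       uint64_t file_size)
{
  for (size_t i = 0; i < count; i++)
    if (pre_eval(rules[i], file_size) == TV_UNKNOWN)
      return 1;
  return 0;
}
```

With a single rule shaped like "filesize < 1024 and <some string>", scan_needed() reports that a 4096-byte file can be skipped, while a 100-byte file still has to be scanned. Real conditions also refer to string offsets, match counts, other rules and so on, and every rule has to be provably decided before the scan can be dropped, which is where the "not trivial" part comes in.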
Original comment by plus...@gmail.com
on 23 May 2013 at 2:18
Original issue reported on code.google.com by
hrvoje.s...@gmail.com
on 21 Jan 2013 at 5:59