@PassMark thanks for the suggestions. Note that version 20171104 of this project is experimental and has not undergone performance optimization.
Might also be related to issue #74, as the poor caching will result in libewf_handle_read_buffer_at_offset being called too often.
Is this speculation or did you observe this behavior in your tests?
Relation to #74 is speculation.
The investigation went like this:
Here is the profiling output (release mode code).
Hopefully we didn't break anything else.
It turns out the new hash algorithm was good for split files but not so good for single monolithic E01 files, so there was a performance regression.
We revised it again, this time using the 64-bit to 32-bit hash function from https://gist.github.com/badboy/6267743
The new version looks like this:
int libewf_segment_file_calculate_cache_entry_index(
     int element_index,
     int element_file_index,
     off64_t element_offset,
     size64_t element_size,
     uint32_t element_flags,
     int number_of_cache_entries)
{
    uint64_t key = ((uint64_t)element_index << 32) | (uint32_t)element_file_index;

    key = (~key) + (key << 18); // key = (key << 18) - key - 1;
    key = key ^ (key >> 31);
    key = key * 21;             // key = (key + (key << 2)) + (key << 4);
    key = key ^ (key >> 11);
    key = key + (key << 6);
    key = key ^ (key >> 22);

    return (int)(key % number_of_cache_entries);
}
This seems to restore the single file performance to what it was and also fix the multi-file performance issue.
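For anyone who wants to check how evenly the function spreads entries, here is a rough sketch. It is not part of libewf: `cache_entry_index` is a hypothetical stand-alone copy of the mixing steps above (without the parameters the posted function never reads), and `count_used_slots` is a made-up helper that counts how many distinct cache slots a run of consecutive element indexes lands in for one segment file.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-alone copy of the mixing steps posted above.
 * The offset, size and flags parameters are omitted because the
 * posted function does not read them. */
static int cache_entry_index(int element_index, int element_file_index,
                             int number_of_cache_entries)
{
    uint64_t key = ((uint64_t)element_index << 32) | (uint32_t)element_file_index;

    key = (~key) + (key << 18);
    key = key ^ (key >> 31);
    key = key * 21;
    key = key ^ (key >> 11);
    key = key + (key << 6);
    key = key ^ (key >> 22);

    return (int)(key % (uint64_t)number_of_cache_entries);
}

/* Counts the distinct cache slots hit by element indexes 0..count-1
 * of a single segment file. Returns -1 on invalid arguments or if
 * the bookkeeping array cannot be allocated. */
static int count_used_slots(int element_file_index, int count,
                            int number_of_cache_entries)
{
    unsigned char *used;
    int slots = 0;

    if (number_of_cache_entries <= 0 || count < 0)
        return -1;

    used = (unsigned char *)calloc((size_t)number_of_cache_entries, 1);
    if (used == NULL)
        return -1;

    for (int i = 0; i < count; i++) {
        int slot = cache_entry_index(i, element_file_index,
                                     number_of_cache_entries);
        assert(slot >= 0 && slot < number_of_cache_entries);
        if (!used[slot]) {
            used[slot] = 1;
            slots++;
        }
    }
    free(used);
    return slots;
}
```

Comparing `count_used_slots(file, n, entries)` for a single segment file against the same call across several file indexes gives a quick feel for whether one layout collides more than the other. The closer the result is to `entries`, the fewer chunks compete for the same slot.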
Various optimizations have been made in the meantime. I'll do some speed comparison tests against RAW and libewf-legacy.
The cache is now MRU-based, and an initial comparison of random-access reads in https://github.com/libyal/libewf/commit/3b4894e9b6a49d737d3e913ddc91267343f1be8d indicates speeds comparable to libewf-legacy. More performance testing will be done at a later stage. Closing issue.
We noticed slow performance on split E01 images in release 20171104.
For example, listing 50,000 file names in an E01 took 153 seconds, whereas doing the same on the original physical drive took a few seconds.
After a couple of days of tracing the problem, profiling the code, and building test cases, we concluded that:
Making these changes gave a 20x performance improvement in some test cases: 153 seconds was reduced to 7 seconds.
This might also be related to https://github.com/libyal/libewf/issues/74, as the poor caching will result in libewf_handle_read_buffer_at_offset being called too often.
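To illustrate why a poorly distributed cache index leads to so many extra reads, here is a toy model. It is only a sketch and not libewf's actual cache: `toy_cache`, `toy_cache_access` and `replay_misses` are hypothetical names, the cache is a simple direct-mapped table, and each miss stands in for one call to the expensive read path.

```c
#include <assert.h>

#define SLOT_COUNT 8 /* hypothetical cache size */

typedef struct {
    int  chunk[SLOT_COUNT]; /* chunk number held by each slot */
    int  valid[SLOT_COUNT]; /* nonzero once a slot has been filled */
    long misses;            /* stands in for calls to the expensive read */
} toy_cache;

/* Looks up `chunk` in `slot`. On a miss the slot's previous content
 * is replaced, which is what makes collisions costly. */
static void toy_cache_access(toy_cache *cache, int chunk, int slot)
{
    assert(slot >= 0 && slot < SLOT_COUNT);

    if (!cache->valid[slot] || cache->chunk[slot] != chunk) {
        cache->misses++;
        cache->chunk[slot] = chunk;
        cache->valid[slot] = 1;
    }
}

/* Reads chunks 0..chunks-1 in order, `rounds` times over.
 * spread != 0: slot is chunk % SLOT_COUNT (chunks are spread out).
 * spread == 0: every chunk maps to slot 0 (worst-case collisions).
 * Returns the number of misses. */
static long replay_misses(int chunks, int rounds, int spread)
{
    toy_cache cache = {0};

    for (int r = 0; r < rounds; r++) {
        for (int c = 0; c < chunks; c++) {
            toy_cache_access(&cache, c, spread ? c % SLOT_COUNT : 0);
        }
    }
    return cache.misses;
}
```

With four chunks read ten times, `replay_misses(4, 10, 1)` returns 4 (each chunk is fetched once and then served from the cache), while `replay_misses(4, 10, 0)` returns 40 because every access evicts the chunk the next access needs.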