onekey-sec / unblob

Extract files from any kind of container formats
https://unblob.org

fix(handler): fix UBI PEB size calculation. #706

Closed qkaiser closed 7 months ago

qkaiser commented 7 months ago

Computing the most frequent interval only works if we return every observed interval value in a list, duplicates included. Returning a set collapses the repeated values, so each interval appears exactly once and the repeating pattern we rely on is lost.

The get_intervals function now actually behaves like numpy.diff: it returns the difference between each pair of consecutive values.
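For illustration, here is a minimal sketch of a numpy.diff-like helper that keeps duplicates. The name `get_intervals_sketch` is hypothetical and this is not necessarily how unblob implements get_intervals; it only shows the list-returning behaviour described above.

```python
from typing import List


def get_intervals_sketch(values: List[int]) -> List[int]:
    """Return the differences between consecutive values, duplicates preserved.

    Hypothetical stand-in for get_intervals, mirroring numpy.diff on a 1-D input.
    """
    # Pair each value with its successor and subtract; an input with fewer
    # than two values yields an empty list.
    return [b - a for a, b in zip(values, values[1:])]


# Four erase block headers spaced 0x20000 bytes apart give three equal intervals.
evenly_spaced = get_intervals_sketch([0x0, 0x20000, 0x40000, 0x60000])
```

Because the result is a list, the repeated 0x20000 entries stay visible to whatever counts them afterwards.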

qkaiser commented 7 months ago

To provide more context, here's how get_intervals is used by the UBI handler:

    def _guess_peb_size(self, file: File) -> int:
        # Since we don't know the PEB size, we need to guess it. At the moment we just find the
        # most common interval between every erase block header we find in the image. This _might_
        # cause an issue if we had a blob containing multiple UBI images, with different PEB sizes.
        all_ubi_eraseblock_offsets = list(iterate_patterns(file, self._UBI_EC_HEADER))

        offset_intervals = get_intervals(all_ubi_eraseblock_offsets)
        if not offset_intervals:
            raise InvalidInputFormat

        return statistics.mode(offset_intervals)
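To make the effect on statistics.mode concrete, here is a small standalone sketch. The offsets are made up for illustration (one spurious header match at 0x61000 among blocks spaced 0x20000 apart), and the intervals are computed inline rather than through unblob's get_intervals.

```python
import statistics

# Hypothetical erase block header offsets: real PEBs every 0x20000 bytes,
# plus one false-positive match at 0x61000.
offsets = [0x0, 0x20000, 0x40000, 0x60000, 0x61000, 0x80000]

# Consecutive differences kept in a list, duplicates included.
intervals = [b - a for a, b in zip(offsets, offsets[1:])]

# With the list, the repeated 0x20000 interval wins the vote.
guess_from_list = statistics.mode(intervals)

# With a set, every distinct interval is present exactly once, so no value
# is more frequent than another and the mode is no longer meaningful.
unique_intervals = set(intervals)
```

Here `guess_from_list` is 0x20000, while `unique_intervals` holds three equally weighted candidates (0x1000, 0x1F000 and 0x20000), which is why a set cannot be used to guess the PEB size.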