scVENUS / PeekabooAV-Installer

This repository provides scripts and configuration files to install, update and test a Peekaboo installation
GNU General Public License v3.0

Configure AMaViS to leave numbers, pages, keys and fringe Microsoft OOXML documents alone #64

Closed michaelweiser closed 4 years ago

michaelweiser commented 4 years ago

Some document formats identify as ZIP archives when looked at by file/magic. This includes Apple Numbers, Pages and Keynote files (.numbers, .pages, .key) and some Microsoft Office XML documents generated by third-party tools (seen with a PDF conversion tool). This causes AMaViS to unpack them and hand the individual content snippets to Peekaboo as samples, which prevents us from doing a proper analysis of the document as a whole. There is no direct way to prevent this for specific file types. There is the option keep_decoded_original_maps, but it would apply to all ZIP archives and cause an additional analysis of every one of them.
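
For reference, a minimal sketch of what enabling that option for ZIP archives might look like. The list is written from memory of the stock amavisd.conf, where the ZIP line ships commented out, so the exact patterns are an assumption and should be checked against the installed configuration:

```perl
# Sketch only, untested: keep the original of every part whose file(1)
# type matches one of these patterns, in addition to its unpacked contents.
@keep_decoded_original_maps = (new_RE(
  qr'^MAIL$',                # let the virus scanner see the full message
  qr'^MAIL-UNDECIPHERABLE$', # same, if the mail is undecipherable
  qr'^(ASCII(?! cpio)|text|uuencoded|xxencoded|binhex)'i,
  qr'^Zip archive data',     # assumption: this matches ALL ZIP archives
));
```

The patterns match on the detected file type, not on the file name, which is why this cannot single out .numbers or .docx files.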

But as with all of AMaViS, we can hook some functions creatively to introduce a blacklist. In the hook we can decide either to prevent unpacking altogether or to let AMaViS unpack as usual but additionally hand the original, unmodified file to the virus scanner.

Example code, e.g. for 50-peekaboo.conf:

# do not unpack certain files even though by their filetype they are archives
my $non_decompose_filename_re = new_RE(
  qr'\.(numbers|key|pages)$',
);

sub do_7zip_filtered($$$;$) {
  my($part, $tempdir, $archiver, $testing_for_sfx) = @_;

  if (defined($part->name_declared) &&
      Amavis::Lookup::lookup(0, $part->name_declared,
                             $non_decompose_filename_re)) {
    Amavis::Util::do_log(4, "filter_decompose_parts: not unpacking %s (%s)",
                         $part->base_name, $part->name_declared);
    # report part being atomic
    return 0;
  }

  return Amavis::Unpackers::do_7zip($part, $tempdir, $archiver,
                                    $testing_for_sfx);
}

unshift @decoders, (
  ['zip', \&do_7zip_filtered, ['7za', '7z']],
);

Untested code for keeping the original:

# do unpack but keep certain files even though by their filetype they are archives
my $keep_original_filename_re = new_RE(
  qr'\.docx$',
);

sub do_7zip_filtered($$$;$) {
  my($part, $tempdir, $archiver, $testing_for_sfx) = @_;

  if (defined($part->name_declared) &&
      Amavis::Lookup::lookup(0, $part->name_declared,
                             $keep_original_filename_re)) {
    Amavis::Util::do_log(4, "filter_decompose_parts: unpacking but keeping %s (%s)",
                         $part->base_name, $part->name_declared);
    # unpack as usual, then report the original as needing to be kept
    Amavis::Unpackers::do_7zip($part, $tempdir, $archiver,
                               $testing_for_sfx);
    return 2;
  }

  return Amavis::Unpackers::do_7zip($part, $tempdir, $archiver,
                                    $testing_for_sfx);
}

unshift @decoders, (
  ['zip', \&do_7zip_filtered, ['7za', '7z']],
);

This needs some testing and a bit of thinking-cap scratching about how to integrate it and which of the two options to choose for analysis.
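
Since both snippets define do_7zip_filtered and register it for 'zip', they cannot simply be pasted into the same file. One way to combine them into a single hook might look like the following sketch. It is just as untested, reuses only the calls already used above, and assumes that AMaViS interprets the return values 0 (atomic) and 2 (unpacked, keep original) as described in the comments of those snippets:

```perl
# Sketch only, untested: one hook covering both behaviours.

# do not unpack these at all
my $non_decompose_filename_re = new_RE(
  qr'\.(numbers|key|pages)$',
);

# unpack these, but also keep the original for the virus scanner
my $keep_original_filename_re = new_RE(
  qr'\.docx$',
);

sub do_7zip_filtered($$$;$) {
  my($part, $tempdir, $archiver, $testing_for_sfx) = @_;
  my $name = $part->name_declared;

  if (defined($name) &&
      Amavis::Lookup::lookup(0, $name, $non_decompose_filename_re)) {
    Amavis::Util::do_log(4, "filter_decompose_parts: not unpacking %s (%s)",
                         $part->base_name, $name);
    # report part being atomic
    return 0;
  }

  # everything else is unpacked by the stock decoder
  my $result = Amavis::Unpackers::do_7zip($part, $tempdir, $archiver,
                                          $testing_for_sfx);

  # only ask for the original to be kept if unpacking reported success
  if ($result && defined($name) &&
      Amavis::Lookup::lookup(0, $name, $keep_original_filename_re)) {
    Amavis::Util::do_log(4, "filter_decompose_parts: unpacked but keeping %s (%s)",
                         $part->base_name, $name);
    return 2;
  }

  return $result;
}

unshift @decoders, (
  ['zip', \&do_7zip_filtered, ['7za', '7z']],
);
```

Unlike the second snippet above, this passes the result of do_7zip through unchanged when unpacking did not succeed, so a broken .docx is not reported as successfully unpacked. Whether that is the desired behaviour is part of what needs testing.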