Open EightBitBoot opened 2 months ago
unfortunately that is actually an error I've seen before (see my note in CHANGELOG.md) and I don't know what can be done about it because it's a yara internal error. you can see where it is raised in the yara source code and you might be able to trace back something useful from there (the value that trips the error is here). you can also see that there was a change in yara 3.11 that was supposed to limit these errors but IIRC that's around when i began seeing them.
there are two command line options that are passed through to the yara engine. when i was confronted with this issue i did a (very little) bit of fiddling in the hopes that might help but gave up quickly. they are --max-match-length and --yara-stack-size. unfortunately i have no suggestions as to what values might help other than "lower is probably better".
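Since there are no known-good values for those two options, one way to try "lower is probably better" systematically is to generate a small grid of commands and run them one at a time. The sketch below only builds the argument lists. The starting values are arbitrary placeholders (not recommendations), the file name is hypothetical, and including -y assumes that is the flag that turns on the YARA scan, as the thread suggests.

```python
from itertools import product


def halving_values(start, minimum):
    """Return start, start // 2, start // 4, ... stopping before going below minimum."""
    if minimum < 1:
        raise ValueError("minimum must be at least 1")
    values = []
    value = start
    while value >= minimum:
        values.append(value)
        value //= 2
    return values


def candidate_commands(pdf_path, max_match_lengths, stack_sizes):
    """Build one pdfalyze argument list per combination of the two option values."""
    commands = []
    for match_length, stack_size in product(max_match_lengths, stack_sizes):
        commands.append([
            "pdfalyze", pdf_path, "-y",
            "--max-match-length", str(match_length),
            "--yara-stack-size", str(stack_size),
        ])
    return commands


# Placeholder starting points: these numbers are assumptions, not tested values.
match_lengths = halving_values(4096, 512)
stack_sizes = halving_values(16384, 4096)
commands = candidate_commands("suspicious.pdf", match_lengths, stack_sizes)
```

Each entry in commands can be handed to subprocess.run to see which combinations (if any) get past the internal error.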
some questions though:
re: 2 it might be possible to incrementally delete the yara rules that come packaged with pdfalyzer from your local installation in an attempt to isolate which rule is causing the error (or whether YARA just always fails on that file regardless of rule).
if you don't know where to find the packaged YARA rules in your local installation of the pdfalyzer try running which pdfalyze. that will probably show you a dir that ends in something like pdfalyze-[stuff]/bin/pdfalyze. the rules files will be in a pdfalyzer/yara_rules/ dir somewhere in the sub-hierarchy (the folder hierarchy will look exactly like this repo's pdfalyzer folder).
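An alternative to deleting rules by hand is to scan the file with each packaged rules file on its own and record which ones raise. This is a rough sketch, not how the pdfalyzer does it: find_failing_rule_files and make_yara_scanner are hypothetical helper names, and the scanner assumes the yara-python package is importable (it is only imported when the scanner is actually built).

```python
from pathlib import Path


def find_failing_rule_files(rule_files, scan):
    """Call scan(rule_file) for each file; return {file: error message} for those that raise."""
    failures = {}
    for rule_file in rule_files:
        try:
            scan(rule_file)
        except Exception as error:  # also catches compile errors, not only error 46
            failures[str(rule_file)] = str(error)
    return failures


def make_yara_scanner(pdf_path):
    """Build a scan function that runs one rules file against the PDF's bytes."""
    import yara  # yara-python, assumed to be installed; imported lazily on purpose

    data = Path(pdf_path).read_bytes()

    def scan(rule_file):
        rules = yara.compile(filepath=str(rule_file))
        rules.match(data=data)

    return scan


# Hypothetical usage; both paths are placeholders for your local installation:
# rules_dir = Path("/path/to/site-packages/pdfalyzer/yara_rules")
# failures = find_failing_rule_files(sorted(rules_dir.iterdir()), make_yara_scanner("suspicious.pdf"))
# for rule_file, message in failures.items():
#     print(rule_file, "->", message)
```

If every rules file fails with the same message, that points at YARA choking on the file itself rather than at one bad rule.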
edit: fixed link to CHANGELOG.md
edit 2: added note about where to find YARA rules
re: 1 the best option is probably to run pdfalyze without the -y option, which is enabled if you specify no options ("Choosing nothing is choosing everything except --streams.")
one other thing i would say is that if you do manage to isolate a yara rule + file combination that trips the error it might be worth filing a bug in the official yara repo
(and it's definitely worth telling me what the rule is so i can at least temporarily remove it from the pdfalyzer)
I just released version 1.15.1. It has a new command line option --no-default-yara-rules. If you use --no-default-yara-rules in tandem with one or more --yara-file options the scan will be done with only your custom YARA rules files. Before this change specifying --yara-file just appended the specified custom rules files to the set of prepackaged YARA rules files so there was no way to run a YARA scan without using the default rules. Now you can use only your own custom YARA rules file(s).
Theoretically this should make it much easier to debug your issue because you can select a limited set of the preconfigured rules (in this directory in the repo) and copy them out to your own custom file (which you then pass to the --yara-file argument) to test with. No which pdfalyze / manual editing of the files installed by pip kind of shenanigans required any more.
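The workflow described above could be scripted roughly like this: copy a chosen subset of rules files into a scratch directory, then build a command that scans with only those files. The helper names and paths are hypothetical, and -y is included on the assumption that it is the flag that enables the YARA scan; only --no-default-yara-rules and --yara-file come from the release note.

```python
import shutil
from pathlib import Path


def copy_rule_subset(selected_files, scratch_dir):
    """Copy the chosen rules files into scratch_dir and return the new paths."""
    scratch_dir = Path(scratch_dir)
    scratch_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for rule_file in selected_files:
        target = scratch_dir / Path(rule_file).name
        shutil.copy(rule_file, target)
        copied.append(target)
    return copied


def custom_rules_command(pdf_path, yara_files):
    """Build a pdfalyze argument list that scans with only the given rules files."""
    argv = ["pdfalyze", str(pdf_path), "-y", "--no-default-yara-rules"]
    for yara_file in yara_files:
        argv += ["--yara-file", str(yara_file)]  # the option may be repeated
    return argv


# Hypothetical usage; every path here is a placeholder:
# subset = copy_rule_subset(["rules/first.yara", "rules/second.yara"], "scratch_rules")
# print(" ".join(custom_rules_command("suspicious.pdf", subset)))
```

Halving the subset each time the error still shows up narrows things down quickly if a single rule is responsible.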
If we're lucky there's just some bad rule in the pre-configured set that is causing this issue on macOS.
I am getting this error and cannot use pdfalyzer at all
Traceback (most recent call last)

/home/lyzen/.local/share/pipx/venvs/pdfalyzer/lib64/python3.12/site-packages/pdfalyzer/output/pdfalyzer_presenter.py:132 in print_yara_results
    129         YaralyzerConfig.args.standalone_mode = True # TODO: using 'standalone mode' lik
    130
    131         try:
  ❱ 132             self.yaralyzer.yaralyze()
    133         except yara.Error as e:
    134             console.print_exception()
    135             print_fatal_error_panel("Internal YARA error! YARA's error codes can be chec

/home/lyzen/.local/share/pipx/venvs/pdfalyzer/lib64/python3.12/site-packages/yaralyzer/yaralyzer.py:149 in yaralyze
    146
    147     def yaralyze(self) -> None:
    148         """Use YARA to find matches and then force decode them"""
  ❱ 149         console.print(self)
    150
    151     def match_iterator(self) -> Iterator[Tuple[BytesMatch, BytesDecoder]]:
    152         """Iterator version of yaralyze. Yields match and decode data tuple back to call

/home/lyzen/.local/share/pipx/venvs/pdfalyzer/lib64/python3.12/site-packages/rich/console.py:1694 in print
    1691             render = self.render
    1692             if style is None:
    1693                 for renderable in renderables:
  ❱ 1694                     extend(render(renderable, render_options))
    1695             else:
    1696                 for renderable in renderables:
    1697                     extend(

/home/lyzen/.local/share/pipx/venvs/pdfalyzer/lib64/python3.12/site-packages/rich/console.py:1326 in render
    1323             )
    1324         _Segment = Segment
    1325         _options = _options.reset_height()
  ❱ 1326         for render_output in iter_render:
    1327             if isinstance(render_output, _Segment):
    1328                 yield render_output
    1329             else:

/home/lyzen/.local/share/pipx/venvs/pdfalyzer/lib64/python3.12/site-packages/yaralyzer/yaralyzer.py:209 in rich_console
    206         """Does the stuff. TODO: not the best place to put the core logic"""
    207         yield bytes_hashes_table(self.bytes, self.scannable_label)
    208
  ❱ 209         for _bytes_match, bytes_decoder in self.match_iterator():
    210             for attempt in bytes_decoder.rich_console(_console, options):
    211                 yield attempt
    212

/home/lyzen/.local/share/pipx/venvs/pdfalyzer/lib64/python3.12/site-packages/yaralyzer/yaralyzer.py:153 in match_iterator
    150
    151     def match_iterator(self) -> Iterator[Tuple[BytesMatch, BytesDecoder]]:
    152         """Iterator version of yaralyze. Yields match and decode data tuple back to call
  ❱ 153         self.rules.match(data=self.bytes, callback=self._yara_callback)
    154
    155         for yara_match in self.matches:
    156             console.print(yara_match)

Error: internal error: 46
this is the traceback
did you try using this approach to isolate the problem?
I'm trying to analyze a potentially malicious PDF file and consistently get the error Internal Error: 46. According to yara's error.h (printed alongside the error), error 46 corresponds to ERROR_TOO_MANY_RE_FIBERS. All other modules are working fine.
Pdfalyzer was installed with pipx (pipx install pdfalyzer). The result of pipx runpip pdfalyzer freeze is:
Please let me know if there's any other debugging info I can provide (aside from the PDF as I don't want to upload anything potentially malicious).
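When collecting results from repeated scans it can help to pull the numeric code out of the error text and attach the name from error.h. A small sketch: describe_yara_error is a hypothetical helper, and the lookup table deliberately contains only the one code confirmed in this thread.

```python
import re

# Only the code confirmed in this thread; anything else needs checking against yara's error.h.
KNOWN_YARA_ERRORS = {46: "ERROR_TOO_MANY_RE_FIBERS"}


def describe_yara_error(message):
    """Return (code, name) parsed from an 'internal error: N' message, or None if absent."""
    match = re.search(r"internal error:\s*(\d+)", message, flags=re.IGNORECASE)
    if match is None:
        return None
    code = int(match.group(1))
    return code, KNOWN_YARA_ERRORS.get(code, "unknown (check yara's error.h)")


print(describe_yara_error("Error: internal error: 46"))
```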