CybercentreCanada / assemblyline

AssemblyLine 4: File triage and malware analysis
https://cybercentrecanada.github.io/assemblyline4_docs/
MIT License

FrankenStrings URL extraction seems to trim URLs on char 0, even when it's not a binary file #229

Closed kam193 closed 3 months ago

kam193 commented 4 months ago

Describe the bug

I've just noticed that FrankenStrings has started trimming URLs that contain the character "0". This looks like a side effect of an attempt to fix extraction from binaries, where URLs often weren't trimmed properly; applied to text files, it cuts URLs off too early.

To Reproduce

Steps to reproduce the behavior:

  1. Upload a file containing a URL like https://some.doma.in/pathwith0butnotended. A real example: utils.py.zip (password: zippy; it contains a Discord webhook).
  2. Wait for the results.
  3. See the URL extracted as https://some.doma.in/pathwith

Expected behavior

The full URL is extracted.
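For anyone who wants to reproduce this without the attached archive, here is a minimal sketch of the expected result. The regex is a simplified stand-in I made up for illustration, not the pattern FrankenStrings or multidecoder actually uses, and `extract_urls` is a hypothetical helper.

```python
import re

# Hypothetical, simplified URL pattern (NOT the one used by FrankenStrings).
URL_RE = re.compile(rb"https?://[A-Za-z0-9._~:/?#\[\]@!$&'()*+,;=%-]+")

# A text file line containing a URL with a "0" in the middle of the path.
SAMPLE = b'WEBHOOK = "https://some.doma.in/pathwith0butnotended"\n'


def extract_urls(data: bytes) -> list[bytes]:
    """Return every URL-looking byte string found in data."""
    return URL_RE.findall(data)


urls = extract_urls(SAMPLE)
print(urls)
```

The point is only that the "0" is an ordinary path character in a text file, so the extracted URL should run up to the closing quote.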

Screenshots

[image]

Environment (please complete the following information if pertinent):

Additional context

cccs-jh commented 4 months ago

You are exactly right: this is a false positive from a recently implemented Pascal string check. I'll change it so that it only runs on data files.
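To illustrate how a length-prefix (Pascal string) heuristic can misfire on text, here is a rough sketch. It is not the multidecoder code: `pascal_trim`, `looks_like_text`, and `extract_url` are hypothetical names, the rule "a byte is a length prefix if exactly that many bytes follow it" is an assumed simplification, and treating "data file" as "not plain text" is also an assumption.

```python
def looks_like_text(data: bytes) -> bool:
    """Crude text check (assumption): no NUL bytes and valid UTF-8."""
    if b"\x00" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def pascal_trim(data: bytes, start: int, end: int) -> int:
    """Hypothetical heuristic: return an earlier end offset if a byte inside
    data[start:end] equals the number of bytes that follow it in the buffer,
    i.e. it could be the length prefix of the next Pascal string."""
    for i in range(start, end):
        if len(data) - (i + 1) == data[i]:
            return i
    return end


def extract_url(data: bytes, start: int, end: int) -> bytes:
    """Apply the Pascal string check only to non-text (data) buffers."""
    if looks_like_text(data):
        return data[start:end]
    return data[start:pascal_trim(data, start, end)]


URL = b"https://some.doma.in/pathwith0butnotended"
# Text buffer padded so that exactly 48 bytes (ord("0") == 48) follow the "0".
text_sample = URL + b"\n" + b"#" * 36

# Ungated check: the "0" is mistaken for a length prefix and the URL is cut.
ungated = text_sample[:pascal_trim(text_sample, 0, len(URL))]
# Gated check: the buffer is text, so the URL is left alone.
gated = extract_url(text_sample, 0, len(URL))

# Binary buffer where "0" really is a length prefix for the next 48 bytes.
binary_sample = b"https://some.doma.in/path" + b"0" + b"A" * 47 + b"\x00"
binary_url = extract_url(binary_sample, 0, len(binary_sample) - 1)

print(ungated, gated, binary_url)
```

With the gate in place, the text sample keeps its full URL while the binary sample is still trimmed at the length byte, which is the behavior the fix above is aiming for.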

cccs-rs commented 3 months ago

This should be resolved in version 1.3.5 of the multidecoder package and in the latest stable release of FrankenStrings.