kevoreilly / CAPEv2

Malware Configuration And Payload Extraction
https://capesandbox.com/analysis/

False Positives in Safe PDF File Analysis #2086

Closed XxCloudMindXx closed 1 month ago

XxCloudMindXx commented 2 months ago

Expected Behavior:

Current Behavior:

Steps to Reproduce:

Submit a safe PDF file for analysis.
Observe the generated alerts and false positives.

Additional Information:

Keyword Count
obj 329
endobj 329
stream 99
endstream 91
xref 1
trailer 1
startxref 1
/Page 38
/Encrypt 0
/ObjStm 0
/JS 0
/JavaScript 0
/AA 0
/OpenAction 0
/AcroForm 0
/JBIG2Decode 0
/RichMedia 0
/Launch 0
/EmbeddedFile 0
/XFA 0
/Colors > 2^24 0
doomedraven commented 2 months ago

Those are community signatures; you are welcome to improve them.

On Fri, 26 Apr 2024, 12:07, XxCloudMindXx @.***> wrote:

Expected Behavior:

  • The system should accurately analyze safe PDF files without triggering false-positive alerts, and the score should not be 10/10.

Current Behavior:

  • The system is showing numerous false positives during the analysis of safe PDF files. These false positives include:

Signatures:

Possible Heap Spray Exploit Detection: Time: 2024-04-17 09:01:09 Caller: 0x7793553c API: NtAllocateVirtualMemory Arguments: ProcessHandle: 0xffffffffffffffff, BaseAddress: 0x00dd1000, RegionSize: 0x00001000, Protection: PAGE_READWRITE Status: Success Return: 0x00000000

[Additional similar instances...]

Collects and Encrypts Information: Time: 2024-04-17 09:01:36 Caller: 0x008e7de7 API: CryptHashData Arguments: CryptHash: 0x014ed7d8, Buffer: [Encrypted Data] Status: Success Return: 0x00000001

[Additional similar instances...]

Attempted Loading of File with Unusual Extension as DLL: Time: 2024-04-17 09:01:14 Caller: 0x0050005c API: LdrLoadDll Arguments: Flags: 0x00000000, FileName: C:\program files (x86)\Adobe\Reader 9.0\Reader\RdLang32.FRA, BaseAddress: 0x00000000 Status: Success Return: 0x00000000

[Additional similar instances...]

CAPE Extracted Potentially Suspicious Content: Suspicious Content: AcroRd32_exe: embedded_pe

Display of Potential Decoy Document to User: Decoy Document: "c:\program files (x86)\adobe\reader 9.0\reader\acrord32.exe" "c:\users\admin\appdata\local\temp\safe.pdf" Time: 2024-04-17 09:01:09 Caller: 0x77941e7e API: NtDelayExecution Arguments: Milliseconds: 30, Status: Skipped

Creation of Hidden or System Files: Files: C:\ProgramData\Adobe\Reader\9.2\ARM\BITAF7C.tmp, C:\ProgramData\Adobe\Reader\9.2\ARM\BITE84F.tmp [Details of NtCreateFile calls...]

Access to Credential Storage Registry Keys: Registry Key: HKEY_LOCAL_MACHINE\System

System Fingerprinting Information Collection: Registry Key: HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\Internet Explorer\Registration\ProductId

Yara Detections: Yara Rule: embedded_pe Process ID: 4272

Steps to Reproduce:

Submit a safe PDF file for analysis. Observe the generated alerts and false positives.

Additional Information:

  • Operating System: Win 10 Pro
  • CAPEv2 Version: [17.11.2023]
  • Adobe Reader: 9.0
  • File Type: PDF document, version 1.3
  • PDF Information:

Keyword Count
obj 329
endobj 329
stream 99
endstream 91
xref 1
trailer 1
startxref 1
/Page 38
/Encrypt 0
/ObjStm 0
/JS 0
/JavaScript 0
/AA 0
/OpenAction 0
/AcroForm 0
/JBIG2Decode 0
/RichMedia 0
/Launch 0
/EmbeddedFile 0
/XFA 0
/Colors > 2^24 0
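The keyword counts above are what make the reporter call the file safe: every active-content keyword is zero. A minimal sketch of that check follows. The `ACTIVE_CONTENT_KEYWORDS` list and its name are an assumption made for illustration, not something CAPE defines; the counts are copied from the report.

```python
# Keyword counts copied from the report above.
counts = {
    "/Page": 38, "/Encrypt": 0, "/ObjStm": 0, "/JS": 0, "/JavaScript": 0,
    "/AA": 0, "/OpenAction": 0, "/AcroForm": 0, "/JBIG2Decode": 0,
    "/RichMedia": 0, "/Launch": 0, "/EmbeddedFile": 0, "/XFA": 0,
}

# Assumption: keywords treated here as indicators of active or embedded content.
ACTIVE_CONTENT_KEYWORDS = [
    "/JS", "/JavaScript", "/AA", "/OpenAction", "/AcroForm",
    "/JBIG2Decode", "/RichMedia", "/Launch", "/EmbeddedFile", "/XFA",
]

# Collect every indicator keyword that occurs at least once in the document.
flagged = [k for k in ACTIVE_CONTENT_KEYWORDS if counts.get(k, 0) > 0]
print(flagged)
```

An empty `flagged` list supports the claim that nothing in the PDF itself drives the alerts; they come from Adobe Reader's own runtime behaviour.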


kevoreilly commented 2 months ago

Which score are you referring to? If it's malscore, this is a legacy Cuckoo feature which is not enabled in CAPE by default for exactly this reason.

The difficulty is in how to avoid scoring actions like these but still catch malicious actions that use the same or similar API. I would be interested to hear any proposal to solve this problem. Failing that, as I mentioned, this is exactly why malscore is not enabled by default. My advice would be to disable it.
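To illustrate why a benign sample can reach 10/10, here is a minimal sketch of a capped additive score. It is not CAPE's actual malscore formula: the `Hit` class, the `capped_score` function, the signature names, and the severities are all hypothetical, chosen only to show how several low-severity matches saturate a capped sum.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    name: str      # hypothetical signature name
    severity: int  # hypothetical severity, 1 (informational) and up


def capped_score(hits, cap=10.0):
    """Sum the severities of all matched signatures and clamp to the cap."""
    return min(cap, float(sum(h.severity for h in hits)))


# Hypothetical matches, loosely modelled on the alerts listed in the report.
hits = [
    Hit("heap_spray", 3),
    Hit("encrypts_info", 2),
    Hit("dll_unusual_extension", 2),
    Hit("decoy_document", 2),
    Hit("hidden_files", 2),
    Hit("credential_store_access", 3),
]
score = capped_score(hits)
print(score)
```

Under this assumed rule, six ordinary-looking matches already exceed the cap, which is the scoring problem described above.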

wasbt commented 2 months ago

@kevoreilly To fix the [Accessed credential storage registry keys] false positive when analysing a PDF with Adobe Reader, I suggest reducing the severity like this:

from lib.cuckoo.common.abstracts import Signature

class RegistryCredentialStoreAccess(Signature):
    name = "registry_credential_store_access"
    description = "Accessed credential storage registry keys"
    severity = 3
    categories = ["persistence", "lateral", "credential_dumping"]
    authors = ["Kevin Ross"]
    minimum = "1.3"
    evented = True
    ttps = ["T1003"]  # MITRE v6,7,8
    ttps += ["T1003.002"]  # MITRE v7,8
    mbcs = ["OB0005"]

    def run(self):
        ret = False
        reg_indicators = [
            "HKEY_LOCAL_MACHINE\\\\SAM$",
            "HKEY_LOCAL_MACHINE\\\\SYSTEM$",
        ]

        for indicator in reg_indicators:
            match = self.check_key(pattern=indicator, regex=True)
            if match:
                self.data.append({"regkey": match})
                ret = True
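        # Tweak: lower the severity when the analysed target is a PDF,
        # to limit false positives caused by the PDF reader itself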
        if "PDF" in self.results["target"]["file"].get("type", ""):
            self.severity = 1
        return ret
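The two indicator patterns end in `$`, so they only match the root `SAM` and `SYSTEM` keys, not their subkeys. The sketch below shows this with plain `re`. It assumes the matching behind `check_key(regex=True)` is case-insensitive and anchored at the start of the key, which is an approximation; the `matches_indicator` helper is hypothetical.

```python
import re

# Same pattern strings as in the signature above.
REG_INDICATORS = [
    "HKEY_LOCAL_MACHINE\\\\SAM$",
    "HKEY_LOCAL_MACHINE\\\\SYSTEM$",
]


def matches_indicator(key):
    """Return True if the key matches any indicator pattern.

    Assumption: case-insensitive matching anchored at the start of the key.
    """
    return any(re.match(p, key, re.IGNORECASE) for p in REG_INDICATORS)


# The key reported in the issue is the root SYSTEM key, so it matches.
root_hit = matches_indicator("HKEY_LOCAL_MACHINE\\System")
# A subkey below SYSTEM does not match because of the trailing "$".
subkey_hit = matches_indicator("HKEY_LOCAL_MACHINE\\SYSTEM\\CurrentControlSet\\Services")
print(root_hit, subkey_hit)
```

Under that assumption, the `HKEY_LOCAL_MACHINE\System` key from the report matches the second pattern, while a subkey beneath it does not.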

I also created a new PDF annotation URL checker, because https://github.com/CAPESandbox/community/blob/master/modules/signatures/all/pdf_annot_urls.py no longer works:


from lib.cuckoo.common.abstracts import Signature

class PDF_Annot_URLs_Checker(Signature):
    name = "pdf_annot_urls_checker"
    description = "The PDF contains a Link Annotation"
    severity = 2  # Default severity
    categories = ["static"]
    authors = ["Wassime BATTA"]
    minimum = "0.5"

    filter_analysistypes = {"file", "static"}

    malicious_tlds_file = "/opt/CAPEv2/data/malicioustlds.txt"

    def __init__(self, *args, **kwargs):
        super(PDF_Annot_URLs_Checker, self).__init__(*args, **kwargs)
        self.malicious_tlds = self.load_malicious_tlds()

    def load_malicious_tlds(self):
        malicious_tlds = set()
        with open(self.malicious_tlds_file, "r") as f:
            for line in f:
                line = line.strip()
                if line.startswith("."):
                    malicious_tlds.add(line)
        return malicious_tlds

    def run(self):
        found_malicious_extension = False
        found_malicious_domain = False
        found_domain_only = False
        suspect = False

        if "PDF" in self.results["target"]["file"].get("type", ""):
            if "Annot_URLs" in self.results["target"]["file"]["pdf"]:
                for entry in self.results["target"]["file"]["pdf"]["Annot_URLs"]:
                    entry_lower = entry.lower()
                    self.data.append({"url": entry})
                    if entry_lower.endswith((".exe", ".php", ".bat", ".cmd", ".js", ".jse", ".vbs", ".vbe", ".ps1", ".psm1", ".sh")) \
                            and not entry_lower.startswith("mailto:"):
                        found_malicious_extension = True

                    if entry_lower.startswith("http://") or entry_lower.startswith("https://"):
                        domain_start = entry_lower.find("//") + 2
                        domain_end = entry_lower.find("/", domain_start)
                        if domain_end == -1:
                            domain = entry_lower[domain_start:]
                        else:
                            domain = entry_lower[domain_start:domain_end]

                        for malicious_tld in self.malicious_tlds:
                            if domain.endswith(malicious_tld):
                                found_malicious_domain = True
                                break
                        else:
                            # If no malicious TLDs detected, set found_domain_only to True
                            found_domain_only = True

            if found_malicious_domain or found_malicious_extension:
                self.severity = 6
                self.description = "The PDF contains a Malicious Link Annotation"
                suspect = True
            elif found_domain_only:
                self.severity = 2
                self.description = "The PDF contains a Link Annotation"
                suspect = True

        return suspect
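The manual slicing above keeps any port or query string in the extracted domain, so a URL such as `http://example.xyz:8080/` would not match `.xyz`. A minimal sketch of a possible alternative uses `urllib.parse.urlparse` from the standard library. The helper name `host_has_listed_tld` is hypothetical and not part of the signature.

```python
from urllib.parse import urlparse


def host_has_listed_tld(url, tlds):
    """Return True if the URL's hostname ends with one of the listed TLDs.

    urlparse().hostname is lowercased and excludes port, path and query;
    it is None for URLs without a network location (e.g. mailto:).
    """
    host = urlparse(url).hostname or ""
    return any(host.endswith(tld) for tld in tlds)


tlds = {".xyz", ".top"}
print(host_has_listed_tld("http://Example.XYZ:8080/a?b=1", tlds))
print(host_has_listed_tld("https://example.com/report.xyz", tlds))
print(host_has_listed_tld("mailto:someone@example.xyz", tlds))
```

Only the first URL has a hostname ending in a listed TLD; the second ends in `.xyz` only in its path, and the third has no hostname at all.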

The list of malicious/suspect TLDs goes in /opt/CAPEv2/data/malicioustlds.txt:

.link
.cam
.bar
.surf
.xyz
.click
.buzz
.gq
.ga
.rest
.ml
.cc
.cfd
.cyou
.accountant
.ar
.bg
.bid
.biz
.biz.ua
.br
.camera
.cf
.club
.co
.co.ua
.co.in
.co.mz
.co.nz
.com.au
.com.tw
.computer
.cricket
.date
.diet
.download
.email
.es
.faith
.gdn
.global
.guru
.help
.in
.info
.kz
.lol
.loan
.media
.men
.news
.ninja
.nyc
.party
.photography
.pt
.pw
.racing
.reise
.review
.rocks
.ru
.science
.site
.solutions
.space
.stream
.tech
.today
.top
.tr
.trade
.uno
.us
.vn
.webcam
.website
.win
.work
.africa
.autos
.best
.bet
.bio
.boats
.bond
.boston
.boutique
.center
.charity
.christmas
.coupons
.dance
.finance
.fishing
.giving
.hair
.haus
.homes
.icu
.kim
.lat
.llp
.loans
.love
.ltd
.mom
.motorcycles
.name
.okinawa
.promo
.rehab
.rugby
.run
.sale
.sew
.skin
.store
.sz
.tattoo
.tokyo
.voto
.wang
.wf
.yachts
.you
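Because the signature opens this file in `__init__`, a missing file would raise before the signature can run. The sketch below shows one possible defensive loader that keeps the same file format (one TLD per line, starting with a dot). The names `parse_tld_lines` and `load_tlds` are hypothetical, and lowercasing the entries is an added assumption.

```python
from pathlib import Path


def parse_tld_lines(lines):
    """Keep only entries that start with a dot, trimmed and lowercased."""
    tlds = set()
    for line in lines:
        entry = line.strip().lower()
        if entry.startswith("."):
            tlds.add(entry)
    return tlds


def load_tlds(path):
    """Read the TLD list, returning an empty set if the file cannot be read."""
    try:
        return parse_tld_lines(Path(path).read_text().splitlines())
    except OSError:
        return set()


print(parse_tld_lines([".xyz\n", "", "# note", " .TOP "]))
```

With an empty set, the TLD check simply never matches, so the signature would still report plain link annotations at the default severity.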

The new script works well with both dynamic and static scans (tested on Win10 with Adobe Reader 9).

wasbt commented 2 months ago

Sometimes the CAPE Suricata alerts flag "AKAMAI-AS" as malicious with severity 3. I also suggest noting in the docs that commenting out the following rule in suricata.rules disables this false positive:

#alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"ET USER_AGENTS Microsoft Device Metadata Retrieval Client User-Agent"; flow:established,to_server; http.user_agent; content:"MICROSOFT_DEVICE_METADATA_RETRIEVAL_CLIENT"; depth:42; endswith; nocase; fast_pattern; classtype:misc-activity; sid:2027390; rev:4; metadata:affected_product Web_Browsers, attack_target Client_Endpoint, created_at 2019_05_28, deployment Perimeter, former_category USER_AGENTS, performance_impact Low, signature_severity Informational, updated_at 2020_09_17;)
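Commenting out a rule by its SID can be scripted rather than done by hand. The sketch below works on the text of a rules file. The `disable_sid` helper is hypothetical, and the two sample rules are shortened placeholders rather than real rules.

```python
import re


def disable_sid(rules_text, sid):
    """Prefix the rule carrying the given sid with '#', leaving other lines as-is."""
    sid_re = re.compile(r"\bsid\s*:\s*%d\s*;" % sid)
    out = []
    for line in rules_text.splitlines(keepends=True):
        # Skip lines that are already commented out.
        if not line.lstrip().startswith("#") and sid_re.search(line):
            line = "#" + line
        out.append(line)
    return "".join(out)


# Shortened placeholder rules for illustration only.
rules = (
    'alert http any any -> any any (msg:"metadata client"; sid:2027390; rev:4;)\n'
    'alert http any any -> any any (msg:"other rule"; sid:1; rev:1;)\n'
)
patched = disable_sid(rules, 2027390)
print(patched)
```

Running it twice is harmless, since lines that are already commented are left alone.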

kevoreilly commented 1 month ago

@wasbt thank you for your suggestions - I have created a PR with these changes as I am very keen to welcome contributions. We would however appreciate PRs in future as it saves unnecessary effort.

https://github.com/CAPESandbox/community/pull/430