abrignoni / iLEAPP

iOS Logs, Events, And Plist Parser
MIT License

Artifact Info Structure Update #568

Open JamesHabben opened 11 months ago

JamesHabben commented 11 months ago

As I have learned more about the structure of this project and how the modules are being developed, I think another update to the artifact structure might help to better organize and document information about the modules and all the individual artifacts they parse. I noticed that most of these modules are 1:1 in the items they add to the report, but there are a few that add multiple items to the report.

Example: Viber Module

In the code, it produces a potential of 4 items in the report based on the existence of certain data, but the artifact structure of the script doesn't give any indication of that since it has only 1 function entry of get_viber in the dictionary. https://github.com/abrignoni/iLEAPP/blob/ceff9fb87c7f4b675c989eeb74c0dbda250fbbcd/scripts/artifacts/viber.py

Artifact v2 of Viber Module

__artifacts_v2__ = {
    "viber": {
        "name": "Viber Artifacts",
        "description": "Get Viber settings, contacts, recent calls and messages information. This script queries "
                       "Settings.data and Contacts.data Viber dbs and creates a report of findings including KML "
                       "geolocation data. Settings hold the user's personal data and configurations. Contacts hold "
                       "contacts, calls, messages and more.",
        "author": "Evangelos Dragonas (@theAtropos4n6)",
        "version": "0.0.2",
        "date": "2022-03-15",
        "requirements": "",
        "category": "Viber",
        "notes": "The code is divided into 4 queries-artifacts blocks. The 1st parses settings db, extracts and "
                 "reports on user's available information regarding Viber configuration. The 2nd parses contacts db, "
                 "extracts and reports on user's contacts. Be advised that a contact may not participate in a chat ("
                 "therefore a contact is not a chat 'member') and vice versa. A chat 'member' may not be registered as "
                 "a Viber contact. The 3rd parses contacts db, extracts and reports on user's "
                 "recent calls that have no corresponding message (ZVIBERMESSAGE) entry, indicating these messages "
                 "have been deleted. The 4th parses contacts db, extracts and reports on user's chats, including extra "
                 "columns with each chat's grouped participants and phone numbers. More information is stored within "
                 "the above databases, and this artifact assists in parsing the most out of it. ",
        "paths": (
            '**/com.viber/settings/Settings.data',
            '**/com.viber/database/Contacts.data',
            '**/Containers/Data/Application/*/Documents/Attachments/*.*',
            '**/com.viber/ViberIcons/*.*'
        ),
        "function": "get_viber"
    }
}

Artifacts

Search for ArtifactHtmlReport( and you find 4 hits for the 4 items it adds to the report, but there is no other way to determine that programmatically in the current form.
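As a rough illustration of that search, the sketch below counts `ArtifactHtmlReport(` constructor calls by walking a module's syntax tree instead of grepping. The helper name `count_report_calls` and the sample source are made up for this example and are not part of iLEAPP.

```python
import ast


def count_report_calls(source, class_name="ArtifactHtmlReport"):
    """Count calls such as ArtifactHtmlReport(...) in a module's source text."""
    count = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            # Handles both `ArtifactHtmlReport(...)` and `module.ArtifactHtmlReport(...)`.
            name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
            if name == class_name:
                count += 1
    return count


# Hypothetical, shortened module source used only for this demonstration.
sample = '''
def get_viber(files_found, report_folder, seeker, wrap_text, timezone_offset):
    report = ArtifactHtmlReport('Viber - Settings')
    report.add_script()
    report = ArtifactHtmlReport('Viber - Contacts')
'''

print(count_report_calls(sample))
```

This still only infers the number of report items from the code; it does not replace declaring them in the artifact structure.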

Update

With an updated artifact structure (maybe __artifacts_v3__?), the structure could take on a one-module-to-many-artifacts relationship. What I have in mind would require a little restructuring of the code that runs the artifacts, though, as the framework currently allows an author to add as many items to the report within an artifact as they like. Instead, if the report addition is tied to the artifact function, and the artifact function is allowed only one instance of a report object, it forces compliance with the information structure.

Proposed Updated Structure

__artifacts_v3__ = {
    "module_name": "Viber Artifacts",
    "description": "Get Viber settings, contacts, recent calls and messages information. This script queries "
                   "Settings.data and Contacts.data Viber dbs and creates a report of findings including KML "
                   "geolocation data. Settings hold the user's personal data and configurations. Contacts hold "
                   "contacts, calls, messages and more.",
    "author": "Evangelos Dragonas (@theAtropos4n6)",
    "version": "0.0.2",
    "date": "2022-03-15",
    "requirements": "",
    "app_name": "Viber",
    "category": "Viber",
    "category_icon": "message-square",
    "notes": "The code is divided into 4 queries-artifacts blocks. The 1st parses settings db, extracts and "
             "reports on user's available information regarding Viber configuration. The 2nd parses contacts db, "
             "extracts and reports on user's contacts. Be advised that a contact may not participate in a chat ("
             "therefore a contact is not a chat 'member') and vice versa. A chat 'member' may not be registered as "
             "a Viber contact. The 3rd parses contacts db, extracts and reports on user's "
             "recent calls that have no corresponding message (ZVIBERMESSAGE) entry, indicating these messages "
             "have been deleted. The 4th parses contacts db, extracts and reports on user's chats, including extra "
             "columns with each chat's grouped participants and phone numbers. More information is stored within "
             "the above databases, and this artifact assists in parsing the most out of it. ",
    "paths": (
        '**/com.viber/settings/Settings.data',
        '**/com.viber/database/Contacts.data',
        '**/Containers/Data/Application/*/Documents/Attachments/*.*',
        '**/com.viber/ViberIcons/*.*'
    ),
    "artifacts": [  # a list: dicts cannot be elements of a set literal
        {
            "artifact_name": "Viber - Settings", # could directly tie to report name
            "function": "get_viber_settings",
            "report_name": "Viber Settings Report", # if you prefer to have this separate from artifact name
            "artifact_icon": "git-commit", # feather icons name
            "report_notes": "Settings pulled from xyz.sqlite file", # to be displayed at the top of the page
            "report_warning": "be careful with the timestamp of this artifact..."
        },
        {
            "artifact_name": "Viber - Contacts",
            "function": "get_viber_contacts",
            "report_name": "Viber Contacts Report",
            "artifact_icon": "user",
            "report_notes": "Settings pulled from xyz.sqlite file"
        },
        {
            "artifact_name": "Viber - Call Remnants",
            "function": "get_viber_calls",
            "report_name": "Viber Calls Report",
            "artifact_icon": "phone-call",
            "report_notes": "Settings pulled from xyz.sqlite file"
        },
        {
            "artifact_name": "Viber - Chats",
            "function": "get_viber_chats",
            "report_name": "Viber Chats Report",
            "artifact_icon": "message-square",
            "report_notes": "Settings pulled from xyz.sqlite file"
        }
    ]
}
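To show what a one-module-to-many-artifacts structure would make possible, here is a minimal sketch that lists every artifact a module declares. It assumes the `artifacts` entry is a list of dictionaries and that `artifact_name` and `function` are the required keys; both the helper `list_artifacts` and that choice of required keys are assumptions for illustration.

```python
# Assumed minimum set of keys for each artifact entry (not defined by the project).
REQUIRED_ARTIFACT_KEYS = ("artifact_name", "function")


def list_artifacts(module_info):
    """Return (artifact_name, function) pairs, raising if a required key is missing."""
    pairs = []
    for entry in module_info.get("artifacts", []):
        missing = [key for key in REQUIRED_ARTIFACT_KEYS if key not in entry]
        if missing:
            raise ValueError(f"artifact entry is missing keys: {missing}")
        pairs.append((entry["artifact_name"], entry["function"]))
    return pairs


# Trimmed-down example structure; the other metadata fields are omitted here.
example_v3 = {
    "module_name": "Viber Artifacts",
    "artifacts": [
        {"artifact_name": "Viber - Settings", "function": "get_viber_settings"},
        {"artifact_name": "Viber - Contacts", "function": "get_viber_contacts"},
    ],
}

for name, func in list_artifacts(example_v3):
    print(f"{name} -> {func}")
```

With something like this, the GUI or the documentation could list each report item without reading the module's code.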

Module Code Updates

With this updated structure, the module calling function would change slightly. Rather than passing in the report folder path, the calling code can automatically create the report object with the already provided name and pass that object into the artifact function to let it add more to the report.

Artifact Function Call

existing: def get_viber(files_found, report_folder, seeker, wrap_text, timezone_offset):

updated: def get_viber(files_found, artifact_report_section, seeker, wrap_text, timezone_offset):

Artifact Report Lines

existing:

        report = ArtifactHtmlReport('Viber - Settings')
        report.start_artifact_report(report_folder, 'Viber - Settings')
        report.add_script()
        data_headers = ('Setting','Value')
        report.write_artifact_data_table(data_headers, data_list, file_found, html_escape=False)
        report.end_artifact_report()

updated:

        report.add_script() # could remove this need with a default
        data_headers = ('Setting','Value')
        report.write_artifact_data_table(data_headers, data_list, file_found, html_escape=False)
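A minimal sketch of how the calling code might own the report lifecycle under this proposal. `StubReport` is a hypothetical stand-in that only records calls (it mirrors the method names used above but is not the real ArtifactHtmlReport class), `run_artifact` is an invented helper, and the artifact function signature is shortened to two parameters for brevity.

```python
class StubReport:
    """Hypothetical stand-in for the report object; it only records calls."""

    def __init__(self, name):
        self.name = name
        self.calls = []

    def start_artifact_report(self, report_folder, artifact_name):
        self.calls.append("start")

    def add_script(self):
        self.calls.append("add_script")

    def write_artifact_data_table(self, data_headers, data_list, file_found, html_escape=True):
        self.calls.append(("table", data_headers, len(data_list)))

    def end_artifact_report(self):
        self.calls.append("end")


def run_artifact(artifact_info, artifact_func, files_found, report_folder, report_cls=StubReport):
    """Open exactly one report for the artifact, run the function, then close the report."""
    name = artifact_info.get("report_name", artifact_info["artifact_name"])
    report = report_cls(name)
    report.start_artifact_report(report_folder, name)
    try:
        artifact_func(files_found, report)
    finally:
        # The report is closed even if the artifact function raises.
        report.end_artifact_report()
    return report


def get_viber_settings(files_found, report):
    # The artifact only supplies its data; setup and teardown belong to the caller.
    data_headers = ("Setting", "Value")
    data_list = [("example_setting", "example_value")]  # placeholder rows
    report.add_script()
    report.write_artifact_data_table(data_headers, data_list, files_found[0], html_escape=False)


info = {"artifact_name": "Viber - Settings", "function": "get_viber_settings"}
result = run_artifact(info, get_viber_settings, ["Settings.data"], "/tmp/report")
print(result.calls)
```

Because the caller creates and closes the report, each artifact function is limited to the single report object it is handed, which is the compliance the proposal is after.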

Thoughts?

I haven't gained a completely thorough understanding of the LEAPP framework yet, so I don't know the full impact of a change like this. I think it would make creating modules quite a bit less intimidating for folks.

Thoughts?

JamesHabben commented 11 months ago

I've been in javascript too much lately. I suppose the pythonic naming convention is artifact_name rather than artifactName

abrignoni commented 11 months ago

Your proposal makes absolute sense, it is efficient and well thought out.

My main concern is people power. Any changes will have to be ported to ALEAPP, VLEAPP, & RLEAPP as well. Currently I am making all artifacts in the LEAPPs timezone aware and it is taking/will take an insane amount of time (months.)

Do you mind giving it a look and gauging the level of difficulty to implement? I'm all for it, but if implementing it requires a lot of refactoring then I propose we look into it after the timezone offset thing is done.

JamesHabben commented 11 months ago

Ya, I can put some time on it. Is the architecture between these versions similar enough to be able to do something like a pull request across repos?

JamesHabben commented 11 months ago

a quick check says it's not quite in line.

iLEAPP: 135 lines https://github.com/abrignoni/iLEAPP/blob/main/scripts/artifact_report.py

aLEAPP: 361 lines https://github.com/abrignoni/ALEAPP/blob/main/scripts/artifact_report.py

JamesHabben commented 11 months ago

Which tool is your primary? Looks like aLEAPP report code above has additional image, timeline, chat functions that iLEAPP doesn't.

abrignoni commented 11 months ago

All have timeline, KML, & tsv functions.

Some of the differences between the projects are due to Android vs iOS quirks and to the stuff folks contribute to one project that isn't used on another.

JamesHabben commented 11 months ago

updated to python naming and added report-icon as an option. with that we can reduce (maybe eliminate) the need for that big icon list object in the reports.py file by letting plugins set it right here. could be useful in the gui when displaying available modules, although i haven't explored the gui much yet.

abrignoni commented 11 months ago

As stuff is merged please send some PRs to RLEAPP & VLEAPP as well so they can benefit too. There is no way I can go back and try to port it over myself.

JamesHabben commented 11 months ago

ya i am working my way towards that before getting this new artifact structure in anywhere. need to set a baseline first.

abrignoni commented 11 months ago

Please add yourself to all the LEAPPs in the developer section. thank you so much for this. It is such a leapp forward. I'll walk myself out...

JamesHabben commented 11 months ago

another update to the structure above.

JamesHabben commented 11 months ago

added report_warning to the artifact structure. if it exists, it can be displayed automatically in a colored panel to caution the examiner about the interpretation of any of the data below.
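A minimal sketch of that optional panel, assuming the report is plain HTML. The function name `render_report_warning` and the Bootstrap-style CSS classes are assumptions for illustration, not the project's actual markup.

```python
import html


def render_report_warning(artifact_info):
    """Return an HTML warning panel, or an empty string when no warning is set."""
    warning = artifact_info.get("report_warning")
    if not warning:
        return ""
    # Escape the text so a warning containing '<' or '&' cannot break the page.
    return f'<div class="alert alert-warning" role="alert">{html.escape(warning)}</div>'


with_warning = {"report_warning": "be careful with the timestamp of this artifact..."}
without_warning = {"artifact_name": "Viber - Contacts"}

print(render_report_warning(with_warning))
print(repr(render_report_warning(without_warning)))
```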

JamesHabben commented 10 months ago

i think we could benefit from having a field that declares whether this module is parsing an artifact from a filesystem dump or an iTunes backup. it can certainly better inform users of what they can expect to extract using the module, and it could also help improve processing speed by allowing the script to skip a module (or an artifact of a module) if it's not processing a data source that it can even extract data from. unsure if this would be better applied at the module or report artifact level.

abrignoni commented 10 months ago

The tooling was originally designed for full file system parsing. We added iTunes backup as an option for folks to develop specific artifacts for these. Since the backups have so little useful data in comparison to a full file system, I have received few to no backup-specific artifacts.

I think it is a good idea, but I'm not sure how much benefit it will provide, in the sense that almost all use cases involve full file systems and not backups.

As a matter of fact, many of the artifacts that find stuff in a backup do so accidentally, since the report comes from an artifact originally designed for a full file system. For example, some of these FFS artifacts would get more hits on a backup if the paths were fully qualified instead of using the backup domains.

Long story short, backups haven't been our focus given how our user population uses the tool, but I am happy to merge anything that helps any use case as long as the overhead is manageable and it does not impact creating an artifact. It has to be a really necessary change (like making all timestamps datetime objects for your html solution to work) in order to require an artifact creator to abide by a particular rule beyond the current ones.

Hope that makes sense.

JamesHabben commented 10 months ago

yup. understood and agree. on the civilian side of this world, we deal almost exclusively with backups or non-FFS extractions. my testing with iLEAPP so far has been with a few iCloud or iTunes backups and i'm happy to see that many of the modules are finding data to process. i haven't fully thought out what adding a 'data source' type of field in the info structure would look like, but it would certainly help potential users (selfishly, me lol) to know if a module can parse data from a backup or if a FFS is required.
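One possible shape for such a field, purely as a sketch: the key name `data_sources` and the values `"ffs"` and `"itunes_backup"` are invented here, and a module that declares nothing is treated as compatible with everything so that existing artifacts would not need to change.

```python
def should_run(module_info, current_source):
    """Decide whether a module is worth running against the current extraction type."""
    supported = module_info.get("data_sources")
    if not supported:
        # No declaration: keep today's behavior and run the module.
        return True
    return current_source in supported


ffs_only = {"module_name": "Example FFS module", "data_sources": ("ffs",)}
undeclared = {"module_name": "Example legacy module"}

print(should_run(ffs_only, "itunes_backup"))
print(should_run(undeclared, "itunes_backup"))
```

Keeping the field optional would fit the concern raised earlier in the thread about not adding rules for artifact creators.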