google / timesketch

Collaborative forensic timeline analysis
Apache License 2.0

Plaso upload fails if Timesketch's L2T version is older than 6 months #2895

Closed bit4n6 closed 8 months ago

bit4n6 commented 1 year ago

TL;DR - If you need a fix now and don't want to follow all this haggling over this being a bug or not:


Last week, I had to do some analysis on an older case (not too old - from July/August) and analyze some triage collections that were created at that time, but not used so far. So I started the VM for that case again (no config changes whatsoever) and - the upload failed. I tried other collections that had already been imported successfully in July/August - but they now failed as well.

After quite some trying, testing and debugging, I found out that pinfo.py is the culprit: When triggered by tasks.py/run_plaso, it throws the following warning (which the JSON decoder, json/decoder.py, cannot handle, resulting in a crash):

  [WARNING] (MainProcess) PID:14237 <tools> This version of plaso is more than 6 months old.
  WARNING: the version of plaso you are using is more than 6 months old. We strongly recommend to update it. 

The L2T version installed at that time was 20230311, i.e. the warning is probably generated on any run of pinfo.py on or after Sep 11 (9/11 - yet again...)
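
The date arithmetic behind this estimate can be sketched as follows. It assumes the cutoff is six calendar months after the date encoded in the version string; the thread does not state the exact threshold pinfo.py applies, so the real cutoff may differ by a few days. The function name is hypothetical.

```python
import calendar
from datetime import date


def six_month_mark(version: str, months: int = 6) -> date:
    """Return the date `months` calendar months after a YYYYMMDD version string."""
    released = date(int(version[0:4]), int(version[4:6]), int(version[6:8]))
    # Work with a zero-based month index so year rollover is a plain division.
    month_index = released.month - 1 + months
    year = released.year + month_index // 12
    month = month_index % 12 + 1
    # Clamp the day for shorter target months (for example the 31st to the 30th).
    day = min(released.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


print(six_month_mark("20230311"))  # 2023-09-11
```

Under the same assumption, version 20230717 would reach its six-month mark on 2024-01-17.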

Note: This is not a case of L2T versions being out of sync between creation of plaso file and upload of the resulting plaso file to Timesketch or of using a very old plaso version in general (I've seen other issues on this). The plaso file has been created with the exact same L2T version as used during upload and still the upload fails

To Reproduce

Steps to reproduce the behavior:

  1. Install Log2Timeline version 20230311 in your Timesketch environment
  2. Create a plaso file from some image or triage collection using this L2T version
  3. Try to upload it to Timesketch (either via Web GUI or from CLI using timesketch_importer)

Expected behavior

Make the parser that reads the pinfo.py output more robust by simply ignoring any warnings generated by pinfo.py.

For the sake of isolation and reproducibility, I usually clone separate VMs for each case and I do not want to change any configs in the VM while I'm working on that case. In this case, it's especially annoying, since with L2T version 20230717, the fields _parser_chain and _event_values_hash have been removed from the OpenSearch output, i.e. after uploading the latest collections with the new L2T version, I cannot use the same queries across all timelines anymore.

I think it's important to have some leeway regarding the L2T version - as long as the same version is used both for parsing and ingesting to Timesketch - to avoid being forced to switch L2T versions in the middle of a case.

Screenshots

No screenshots (the timeline "Info" pop-up of the Web GUI unfortunately does not provide any Error Details)

Content from worker.log:

    [2023-09-15 12:53:19,262] celery.worker.strategy/INFO Task timesketch.lib.tasks.run_plaso[e0e8d913dc4449489b892fab98387254] received
    [2023-09-15 12:53:19,269] timesketch.tasks/INFO Index timeline [base-admin-triage20230914t153402z-triage] to index [44c185b3afe842d5a5bc7ae66602d17e] (source: plaso)
    [2023-09-15 12:53:20,028] timesketch.tasks/ERROR Error: Expecting value: line 1 column 1 (char 0)
    Traceback (most recent call last):
      File "/usr/local/lib/python3.10/dist-packages/timesketch/lib/tasks.py", line 649, in run_plaso
        storage_counters = json.loads(re.sub(r"^{, ", r"{", storage_counters_json))
      File "/usr/lib/python3.10/json/__init__.py", line 346, in loads
        return _default_decoder.decode(s)
      File "/usr/lib/python3.10/json/decoder.py", line 337, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "/usr/lib/python3.10/json/decoder.py", line 355, in raw_decode
        raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
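
The error message in this traceback is what Python's json module reports whenever the text it is given does not start with a JSON value. The following minimal sketch reproduces it with a shortened stand-in for the pinfo.py stdout described in this report; the helper name and the sample text are illustrative only.

```python
import json

# Shortened stand-in for stdout: a plain-text warning followed by the JSON.
stdout_text = (
    "WARNING: the version of plaso you are using is more than 6 months old. "
    "We strongly recommend to update it.\n\n"
    '{"storage_counters": {"parsers": {"total": 4253686}, "event_labels": {}}}\n'
)


def try_parse(text):
    """Return (parsed, None) on success or (None, error message) on failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as error:
        return None, str(error)


result, message = try_parse(stdout_text)
print(message)  # Expecting value: line 1 column 1 (char 0)
```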

Console output of timesketch_import:

    $ timesketch_importer -u admin -p admin --host http://127.0.0.1 --timeline_name <import-name> --sketch_id 1 </path/to/import.plaso> 
    [2023-09-15 11:10:39,753] timesketch_importer.importer_frontend/INFO Using cached credentials.
    [2023-09-15 11:11:16,383] timesketch_importer.importer_frontend/INFO Creating a client.
    [2023-09-15 11:11:31,802] timesketch_importer.importer_frontend/INFO Client created.
    [2023-09-15 11:11:33,419] timesketch_importer.importer_frontend/INFO Saving TS config.
    [2023-09-15 11:13:00,555] timesketch_importer.importer_frontend/INFO Uploading file.
    [2023-09-15 11:13:49,050] timesketch_importer.importer_frontend/INFO About to upload file.
    [2023-09-15 11:14:08,891] timesketch_importer.importer_frontend/INFO File upload completed.
    Checking file upload status: [FAIL]


joachimmetz commented 1 year ago

[WARNING] (MainProcess) PID:14237 <tools> This version of plaso is more than 6 months old.

As the log line clearly indicates this is just a warning and not preventing you from reading the Plaso storage file

I think it's important to have some leeway regarding the L2T version - as long as the same version is used both for parsing and ingesting to Timesketch - to avoid being forced to switch L2T versions in the middle of a case.

You should be able to use the older version of Plaso, there are Dockerized versions available that include all dependencies.

the fields _parser_chain and _event_values_hash have been removed from the OpenSearch output, i.e. after uploading the latest collections with the new L2T version, I cannot use the same queries across all timelines anymore.

these are Plaso internal only values, can you describe why you want to query on these?

bit4n6 commented 1 year ago

[WARNING] (MainProcess) PID:14237 <tools> This version of plaso is more than 6 months old.

As the log line clearly indicates this is just a warning and not preventing you from reading the Plaso storage file

Yes, I know. That's why I suggest ignoring the warning. The problem is that pinfo.py writes
[WARNING] (MainProcess) PID:14237 <tools> This version of plaso is more than 6 months old.
to stderr, but
WARNING: the version of plaso you are using is more than 6 months old. We strongly recommend to update it.
to stdout, and tasks.py dutifully forwards it to the json decoder - which cannot parse it and crashes.

I've tested this on the Python CLI:

  Python 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0] on linux
  >>> import subprocess
  >>> cmd = ["pinfo.py", "--output-format", "json", "--sections", "events", "</path/to/plaso-file>", ]
  >>> command = subprocess.run(cmd,capture_output=True, check=True)
  >>> storage_counters_json = command.stdout.decode("utf-8")
  >>> print(storage_counters_json)
  WARNING: the version of plaso you are using is more than 6 months old. We
  strongly recommend to update it.

  {"storage_counters": {"parsers": {"filestat": 18447, "total": 4253686, "recycle_bin": 1, "prefetch": 923, "utmp": 523, "setupapi": 1086, "winreg_default": 341416, "windows_typed_urls": 9, "windows_run": 16, "pe": 362, "shell_items": 1626, "lnk": 930, "windows_sam_users": 12, "explorer_programscache": 4, "msie_zone": 72, "olecf_summary": 4, "olecf_default": 52, "mrulist_string": 9, "explorer_mountpoints2": 12, "mrulistex_string_and_shell_item": 19, "mrulistex_string": 18, "userassist": 110, "bagmru": 238, "winevtx": 581236, "network_drives": 2, "mstsc_rdp_mru": 1, "mstsc_rdp": 15, "mrulistex_string_and_shell_item_list": 1, "mrulistex_shell_item_list": 12, "amcache": 2158, "msie_webcache": 19452, "olecf_automatic_destinations": 313, "chrome_cache": 1635, "cups_ipp": 1, "chrome_preferences": 23, "firefox_cookies": 108, "google_analytics_utmt": 1, "google_analytics_utma": 3, "google_analytics_utmb": 1, "google_analytics_utmz": 1, "chrome_27_history": 43, "chrome_66_cookies": 420, "firefox_history": 152, "windows_boot_execute": 2, "appcompatcache": 754, "windows_timezone": 1, "windows_shutdown": 2, "windows_services": 592, "bam": 22, "srum": 246031, "windows_version": 3, "networks": 4, "windows_task_cache": 566, "winlogon": 4, "mft": 3034238}, "event_labels": {}}}
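
One way to make the caller tolerant of such a preamble is to start decoding at the first opening brace instead of at the start of stdout. This is a minimal sketch, not the fix that was eventually applied in Timesketch or Plaso; the function name is hypothetical, and it assumes the warning text itself never contains a "{".

```python
import json


def parse_pinfo_stdout(stdout_text: str) -> dict:
    """Decode the first JSON object in stdout, skipping any text before it."""
    start = stdout_text.find("{")
    if start == -1:
        raise ValueError("no JSON object found in pinfo.py output")
    # raw_decode parses one JSON value beginning at index `start` and
    # returns the value together with the index where parsing stopped.
    parsed, _end = json.JSONDecoder().raw_decode(stdout_text, start)
    return parsed


# Shortened version of the output shown above.
sample = (
    "WARNING: the version of plaso you are using is more than 6 months old. We\n"
    "strongly recommend to update it.\n\n"
    '{"storage_counters": {"parsers": {"filestat": 18447, "total": 4253686}, '
    '"event_labels": {}}}\n'
)
counters = parse_pinfo_stdout(sample)
print(counters["storage_counters"]["parsers"]["total"])  # 4253686
```

Output without a warning is handled the same way, since the search then finds the brace at the very start.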

I think it's important to have some leeway regarding the L2T version - as long as the same version is used both for parsing and ingesting to Timesketch - to avoid being forced to switch L2T versions in the middle of a case.

You should be able to use the older version of Plaso, there are Dockerized versions available that include all dependencies.

That's not the problem; I have my archived VMs with older versions of L2T, including all dependencies. Still, the timesketch_importer tool chain will fail with an L2T version older than six months as long as the warning is not filtered out before pinfo.py's output is passed to the JSON parser. As a workaround, I can backdate my system time, but that's a nasty hack.

Anyway - since you chimed in to this discussion: Maybe you could add a "suppress warnings on stdout"-option to pinfo.py?

the fields _parser_chain and _event_values_hash have been removed from the OpenSearch output, i.e. after uploading the latest collections with the new L2T version, I cannot use the same queries across all timelines anymore.

these are Plaso internal only values, can you describe why you want to query on these?

I often used them to restrict searches to specific types or groups of artifacts. Maybe I should have used other fields like source_long, source_short or data_type; I guess it was just because _parser_chain was always listed at the top in Timesketch that I got used to using it. And, as it seems, I'm not the only one who fell for that bad habit: The folks from blueteam0ps also used _parser_chain quite heavily in the rules of their tags.yaml file (see my issue there: https://github.com/blueteam0ps/AllthingsTimesketch/issues/12).

joachimmetz commented 1 year ago

Anyway - since you chimed in to this discussion: Maybe you could add a "suppress warnings on stdout"-option to pinfo.py?

Given the tooling and dependencies are continuously evolving, it is not a good idea to use older versions.

The folks from blueteam0ps also used _parser_chain quite heavily in the rules of their tags.yaml file (see my issue there:

parser_chain is not the way to do this, use data_type
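
To illustrate the suggestion above, here is a small sketch of restricting a search by data_type rather than _parser_chain. The helper name and the parser-to-data_type pairs are illustrative assumptions that do not come from this thread; check the data_type values against what is actually present in your own timelines before relying on them.

```python
# Hypothetical mapping from parser names (as seen in _parser_chain) to the
# data_type values to query instead. The pairs are assumptions for illustration.
PARSER_TO_DATA_TYPE = {
    "winevtx": "windows:evtx:record",
    "prefetch": "windows:prefetch:execution",
    "filestat": "fs:stat",
}


def data_type_query(parser_name: str) -> str:
    """Build a query-string filter on data_type for a given parser name."""
    data_type = PARSER_TO_DATA_TYPE[parser_name]
    # Quote the value because data_type values contain colons.
    return f'data_type:"{data_type}"'


print(data_type_query("winevtx"))  # data_type:"windows:evtx:record"
```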

bit4n6 commented 1 year ago

Anyway - since you chimed in to this discussion: Maybe you could add a "suppress warnings on stdout"-option to pinfo.py?

Given the tooling and dependencies are continuously evolving, it is not a good idea to use older versions.

I totally agree with you that we should keep our tools up-to-date. But reproducibility is a core requirement in forensics and therefore, once a case has been started with an (at that point in time) current toolset, it should be ensured that the case can be completed with it (and then start the next case with the then latest version available).
A forensic tool that worked fine at the beginning of a case should not fail from one day to another just because the clock ticked on.

The folks from blueteam0ps also used _parser_chain quite heavily in the rules of their tags.yaml file (see my issue there:

parser_chain is not the way to do this, use data_type

I guess that's my key takeaway from this issue.

joachimmetz commented 1 year ago

But reproducibility is a core requirement in forensics and therefore, once a case has been started with an (at that point in time) current toolset, it should be ensured that the case can be completed with it (and then start the next case with the then latest version available).

which would be the responsibility of the analyst, given that more factors play into this outside the control of any tool author, e.g. the system and its configuration on which the tool runs.

bit4n6 commented 1 year ago

But reproducibility is a core requirement in forensics and therefore, once a case has been started with an (at that point in time) current toolset, it should be ensured that the case can be completed with it (and then start the next case with the then latest version available).

which would be the responsibility of the analyst, given that more factors play into this outside the control of any tool author, e.g. the system and its configuration on which the tool runs.

Well - in this case it's out of the control of the analyst - because I, as an analyst, cannot stop the time from ticking on. If I do not change anything in the setup - nothing should change in the output (especially not to the effect that there's no output).

I think I've provided all the analysis for the bug so fixing it should really be a very minor effort and it makes timesketch_importer just a bit more robust.

joachimmetz commented 1 year ago

Well - in this case it's out of the control of the analyst - because I, as an analyst, cannot stop the time from ticking on.

This is not entirely true, you can keep a snapshot of the previous versions including most of the configuration, this is exactly what Docker does for you. You can make sure your workflows preserve data where needed. Especially if your workflow relies on intermediate formats like the Plaso storage file.

bit4n6 commented 1 year ago

Hmm - I think we're talking at cross purposes here. It is not about a change of version. It's the same version before and after. It's just that version 20230311 of pinfo.py, as of Sep 11, creates a warning to stdout and timesketch_importer can't handle it and crashes. No Docker container would stop that from happening (by the way: I'm already using a dockerized version of Timesketch).

joachimmetz commented 1 year ago

Hmm - I think we're talking at cross purposes here.

Not entirely, agree that in this specific instance the warning should not lead to breakage, however that does not change the fact that you can adjust/influence your workflow. Given that new things in formats are uncovered continuously, I'm wary of relying on older versions of tooling that contain known issues.

bit4n6 commented 1 year ago

Not entirely, agree that in this specific instance the warning should not lead to breakage, ...

OK, then we agree that this issue should be fixed? As to the question of if or when tools should be updated within a running case, there can be different opinions. I'd do it only as a last resort. Still: I'd always want to start a case with a current and clean toolset.

joachimmetz commented 1 year ago

Still: I'd always want to start a case with a current and clean toolset.

For my understanding, define what you mean by "clean". How does your analysis method ensure this "clean" state?

For context I started testing numerous tools out there mentioned by others as "de facto" tools, finding out they are fundamentally broken. Is "clean" supposed to ensure tooling meets certain test criteria? If so can these be shared?

Or do you mean deployment testing? Or continued monitoring/testing of a deployment?

blueteam0ps commented 1 year ago

@bit4n6 did you manage to find a workaround for this?

bit4n6 commented 1 year ago

@blueteam0ps : Yes - you'll find the description at the top of this thread - below the "TL;DR" headline. BTW: Thanks for updating your tags.yaml so quickly!

jaegeral commented 10 months ago

Chiming in here a bit: at some point, we need to cut backwards compatibility for import support. Some might say 6 months is too short; for others it is not a problem. Plaso is developing quickly, so we accept that kind of cut-off. I agree that we could do better at telling users / analysts about that.

mpilking commented 8 months ago

Big thanks for documenting this @bit4n6. You just saved me a ton of time!

FYI, the reason I'm using a version of plaso that is more than 6 months old is because I'm using the latest official release of Timesketch (20231206) and that is using plaso 20230717 according to tsctl info. We are now 10 days past the 6 month mark on that version of plaso, so I hit this bug.

jkppr commented 8 months ago

Thanks for raising the issue. The problem is fixed in the current latest timesketch version (master branch) and was also fixed upstream in the Plaso code base. Our next Timesketch release is scheduled for 2024-02-07 and will include those fixes as well.