Closed lxndrblz closed 3 years ago
Can you provide an example of the code you're using that generates this result? Is the input data indexeddb or just a "generic" leveldb?
@cclgroupltd Thanks for your swift response. The (simplified) code I am using looks like this:
import click
from pathlib import Path
from ccl_chrome_indexeddb import ccl_leveldb


def read_input(filepath):
    # Do some basic error handling
    if not filepath.endswith('leveldb'):
        raise Exception('Expected a leveldb folder. Path: {}'.format(filepath))

    p = Path(filepath)
    if not p.exists():
        raise Exception('Given file path does not exist. Path: {}'.format(filepath))
    if not p.is_dir():
        raise Exception('Given file path is not a folder. Path: {}'.format(filepath))
    parse_db(filepath)


def parse_db(filepath):
    try:
        db = ccl_leveldb.RawLevelDb(filepath)
    except Exception as e:
        print(f' - Could not open {filepath} as LevelDB; {e}')
        # Nothing to iterate or close if the database could not be opened
        return

    try:
        # Raw access: record.value is the undecoded byte string
        for record in db.iterate_records_raw():
            print(record.value)
            print("*" * 20)
    except ValueError:
        print('Exception reading LevelDB: ValueError')
    except Exception as e:
        print(f'Exception reading LevelDB: {e}')
    finally:
        # Close the database
        db.close()


@click.command()
@click.option('--filepath', '-f', required=True,
              help="Path to the IndexedDB")
def cli(filepath):
    read_input(filepath)


if __name__ == '__main__':
    cli()
Note: The -f parameter is the path to the IndexedDB folder, such as:
C:\Temp\https_teams.microsoft.com_0.indexeddb.leveldb
This folder contains all of my .ldb files and the metadata, such as the manifest and log files.
Yes, the data I am passing to the script is an IndexedDB and not one of the generic ones.
So it looks like you're using the raw access to leveldb here rather than the indexeddb functionality, which is why it's not doing any decoding. I would suggest looking at the "Using the Modules" section in the readme: https://github.com/cclgroupltd/ccl_chrome_indexeddb#using-the-modules
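For readers following along, a rough sketch of what the IndexedDB route might look like is below. It assumes the wrapper names used in the project's readme (`WrappedIndexDB`, `database_ids`, `object_store_names`, `iterate_records`) and that the module is importable from the same package as in the script above; these may differ between versions, so treat the readme as authoritative. The function name `dump_indexeddb` and the `blob_dir` parameter are made up for illustration.

```python
def dump_indexeddb(leveldb_dir, blob_dir=None):
    """Print each record's key and decoded value (illustrative sketch only)."""
    # Assumption: module path mirrors the import used in the script above.
    # Imported lazily so the sketch can be loaded without the package present.
    from ccl_chrome_indexeddb import ccl_chromium_indexeddb

    # Assumption: wrapper API as described in the project's readme.
    wrapper = ccl_chromium_indexeddb.WrappedIndexDB(leveldb_dir, blob_dir)
    for db_id in wrapper.database_ids:
        db = wrapper[db_id.dbid_no]
        for store_name in db.object_store_names:
            obj_store = db[store_name]
            for record in obj_store.iterate_records():
                # Unlike the raw LevelDB records, record.value here is
                # expected to be the deserialised object.
                print(record.user_primary_key, record.value)
```

Calling `dump_indexeddb(r"C:\Temp\https_teams.microsoft.com_0.indexeddb.leveldb")` would then replace the `parse_db` call in the script above.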
Hi,
Thanks for your efforts in developing this code and your blog posts! It is much appreciated.
I am using your code in one of my forensics projects for extracting conversation artefacts from an Electron-based communication platform. While the enumeration works incredibly reliably, I have the impression that the `record.value` fields are not fully decoded. My response looks like this right now:
From your blog post I took away that each of the recurring tags, such as `"` or `{`, gives an indication of the data that follows and its data type. I am now wondering whether this object decoding has already been implemented or whether I did something wrong.

Currently, I am working around this issue by splitting the record's value on the `"` character and ignoring the first byte after the split. While this works in most cases, it does not seem ideal, as it fails if a record contains a nested JSON array.
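To illustrate why the split-based workaround is fragile, here is a minimal sketch of reading a single one-byte string from a V8-serialized value. It assumes the layout used by V8's serializer for this type: the `"` tag, then a varint length, then that many Latin-1 bytes. The helper names are hypothetical and this is not the library's implementation. The "first byte after the split" is that length prefix, which only fits in one byte for strings shorter than 128 characters.

```python
from typing import Tuple

ONE_BYTE_STRING_TAG = 0x22  # the '"' character


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a base-128 varint at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last byte
            return result, pos
        shift += 7


def read_one_byte_string(buf: bytes, pos: int) -> Tuple[str, int]:
    """Read a '"'-tagged, length-prefixed string; return (text, next position)."""
    if buf[pos] != ONE_BYTE_STRING_TAG:
        raise ValueError(f"expected string tag at offset {pos}")
    length, pos = read_varint(buf, pos + 1)
    end = pos + length
    if end > len(buf):
        raise ValueError("string length runs past end of buffer")
    return buf[pos:end].decode("latin-1"), end


# Two strings back to back; the second one contains a '"' in its payload,
# which a naive split on '"' would cut in half.
sample = b'"\x05hello"\x03a"b'
first, offset = read_one_byte_string(sample, 0)
second, offset = read_one_byte_string(sample, offset)
```

Because the reader advances by the declared length rather than searching for the next tag, embedded quote characters and strings longer than 127 bytes do not confuse it. Handling nested objects and arrays would need the remaining tags as well, which is what the IndexedDB layer of the library is meant to do.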
Please let me know if you need any additional details. I would be willing to share my test leveldb, as it contains only staged entries and nothing secretive.