Closed by boingomw 4 months ago
Same issue as https://github.com/google/timesketch/issues/2334
@boingomw I see that you have 2 imports reported. This indicates that you ran the importer twice with the same timeline name. That will put the events in the same timeline.
Can you confirm that this isn't the case? I can't reproduce this issue on my end.
I did run it twice: once for the .json file and once for the .plaso file. The issue is that both files had the same number of lines, but when I imported the json file, it ended up with 2x the number of events.
So when you follow the process above, you end up with 7k for the .json and 7k for the .plaso file?
I can confirm here that timesketch_importer is also creating doubled sources for JSONL imports, doubling the events in searches. The same does not apply to web imports, which seem to import correctly.
# timesketch_importer --version
API Client Version: 20230721
Importer Client Version: 20230721
If you need a sample jsonl, I can supply.
Regards,
I just tried it with the CLI tool: timesketch --sketch 1 import /usr/local/src/timesketch/temp/sigma_temp.jsonl
The content of the file (https://github.com/google/timesketch/blob/master/test_tools/test_events/sigma_events.jsonl) is:
{"message": "A message","timestamp": 123456789,"datetime": "2015-07-24T19:01:01+00:00","timestamp_desc": "Write time","extra_field_1": "foo"}
{"message": "Another message","timestamp": 123456790,"datetime": "2015-07-24T19:01:02+00:00","timestamp_desc": "Write time","extra_field_1": "bar"}
{"message": "Yet more messages","timestamp": 123456791,"datetime": "2015-07-24T19:01:03+00:00","timestamp_desc": "Write time","extra_field_1": "baz"}
{"message": "Install: zmap:amd64 (1.1.0-1) [Commandline: apt-get install zmap]","timestamp": 123456791,"datetime": "2015-07-24T19:01:03+00:00","timestamp_desc": "foo","command":"Commandline: apt-get install zmap","data_type":"apt:history:line","display_name":"GZIP:/var/log/apt/history.log.1.gz","filename":"/var/log/apt/history.log.1.gz","packages":"Install: zmap:amd64 (1.1.0-1)","parser":"apt_history"}
{"message": "[11 / 0x000b] Source Name: Microsoft-Windows-Sysmon Strings: ['DLL', '2022-01-22 23:07:43.492', '{C784477D-8DE8-61EC-AAAA-000000003C00}', '7812', 'C:\\Windows\\tifubjdl\\lysjbpb.exe', 'C:\\Windows\\itfnduuui\\Corporate\\mimilib.dll', '2022-01-22 23:07:43.492'] Computer Name: DESKTOP-B0TAAAA Record Number: 913 Event Level: 4","computer_name":"DESKTOP-B0TAAAA","data_type":"windows:evtx:record","datetime":"2022-01-22T23:07:43.502205+00:00","display_name":"OS:/data/input/Microsoft-Windows-Sysmon%4Operational.evtx","event_identifier":"11","event_level":"4","message_identifier":"11","parser":"winevtx","source_name":"Microsoft-Windows-Sysmon","timestamp":"1642892863502205","timestamp_desc":"Creation Time" }
And I got a new timeline with 5 events.
Maybe it's volume related and 5 lines aren't enough to trigger it.
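One way to test the volume hypothesis is to generate JSONL files of different sizes and compare the number of lines written with the event count reported after import. The following is a minimal sketch, not part of Timesketch: the function names and the event_id field are hypothetical, the microsecond timestamp is an assumption, and the fields mirror the first sample events above.

```python
import json
from datetime import datetime, timedelta, timezone

# Hypothetical helper names; nothing here is part of the Timesketch code base.
START = datetime(2015, 7, 24, 19, 1, 1, tzinfo=timezone.utc)


def make_events(count, start=START):
    """Yield `count` synthetic events, one second apart."""
    for index in range(count):
        moment = start + timedelta(seconds=index)
        yield {
            "message": f"Synthetic event {index}",
            # Assumption: timestamp expressed in microseconds since the epoch.
            "timestamp": int(moment.timestamp() * 1_000_000),
            "datetime": moment.isoformat(),
            "timestamp_desc": "Write time",
            # Unique marker so duplicated events are easy to spot in a search.
            "event_id": index,
        }


def write_jsonl(path, count):
    """Write `count` events to `path`, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for event in make_events(count):
            handle.write(json.dumps(event) + "\n")
    return count


def count_jsonl(path):
    """Count the non-empty lines in a JSONL file."""
    with open(path, "r", encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())
```

Writing files with, say, 5, 500, and 5000 events and importing each one into a fresh timeline would show whether the doubling depends on size. If an event_id value shows up twice in a search, that event was imported twice.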
@jaegeral, try this. Password: sample123. It has 484 events, but doubles up on importing.
Regards
@jaegeral I just realized that you're using the timesketch CLI instead of timesketch-import-client (timesketch_importer). Is there any difference between the two approaches?
Hm indeed, it is importing them twice.
FWIW, I am still working on this; it seems my e2e tests in https://github.com/google/timesketch/pull/2976 do not trigger it.
Still seeing this bug in the latest version of TS. Looking at the code, this flush call isn't needed since the stream close method already calls flush(). We are seeing duplicates because flush() is called twice (and the _data_lines buffer isn't cleared directly by flush(), which makes the method name a bit misleading).
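To illustrate the mechanism described in the comment above, here is a minimal, self-contained sketch. It is not the Timesketch importer code: BufferedUploader, import_events, and the clear_on_flush switch are hypothetical stand-ins, and only the names flush(), close(), and _data_lines are taken from the comment.

```python
class BufferedUploader:
    """Hypothetical stand-in for a streaming importer (not Timesketch code)."""

    def __init__(self, clear_on_flush=False):
        self._data_lines = []  # events waiting to be uploaded
        self.uploaded = []     # events the "server" has received
        self._clear_on_flush = clear_on_flush

    def add(self, event):
        self._data_lines.append(event)

    def flush(self):
        # Upload everything in the buffer. With clear_on_flush=False the
        # buffer is left untouched, so a second flush() re-sends the same
        # events.
        self.uploaded.extend(self._data_lines)
        if self._clear_on_flush:
            self._data_lines = []

    def close(self):
        # close() already flushes, so an explicit flush() just before it
        # is redundant.
        self.flush()


def import_events(events, explicit_flush, clear_on_flush=False):
    """Return how many events reach the server for a given call pattern."""
    uploader = BufferedUploader(clear_on_flush=clear_on_flush)
    for event in events:
        uploader.add(event)
    if explicit_flush:
        uploader.flush()
    uploader.close()
    return len(uploader.uploaded)
```

In this sketch, import_events(range(484), explicit_flush=True) returns 968, while dropping the explicit flush, or clearing the buffer inside flush(), returns 484. Either change removes the duplicates here. Clearing the buffer in flush() would additionally make repeated calls harmless.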
Describe the bug
timesketch_importer runs twice when executed on json_line files, resulting in double events.
To Reproduce
Steps to reproduce the behavior:
log2timeline.py --storage_file example.plaso /usr/bin
pinfo.py example.plaso
Expected behavior
Expected it to not double import.
Desktop (please complete the following information):
latest docker install, as of 6/15/2023