Closed bdsmith48 closed 6 years ago
Hi @bdsmith48,
I just tried this and it worked just fine here - only thing is that print doesn't make it as nice to read as pprint but that's just a minor thing. A couple of questions to see if I can help you further.
I've found that depending on the host (RAM, CPU etc) and size of logfile it can take some time to read in the file. If this is your own file - what size is the file/how many lines does it contain?
Have you tried any of the other files that are bundled with Bat in the data subdirectory?
Cheers, Mike
I've tried it on Windows with Python 2.7.6 and on a Mac, but I can't currently check which version I was using on the Mac.
I don't think I've got 2.7.6 installed anywhere at the moment, so I can't replicate this right now.
If you just set it to read in the log, does it ever finish - if you run it in the interactive Python shell for example?
Have you tried any logs other than ssh.log?
@bdsmith48 @swedishmike My guess here is that ssh.log is an empty file or something. @bdsmith48 please try this with the files included in this repository under data/
Is there a reason why this wouldn't work with other Bro log files? I know my files aren't empty and they aren't extremely long. It turns out that it did work for your log files, though.
I've used it with log files from quite a number of different installations, as well as some publicly available ones where I don't even know which version of Bro generated them, so it should work with yours too.
Three quick questions...
Cheers, Mike
P.S Just another, really silly, question - you are running Bro with the default, tab separated, logs and not Json logs?
@swedishmike good point about the json logs. @bdsmith48 if you have json logs they are currently not supported but on our todo list. See https://github.com/Kitware/bat/issues/40
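A quick way to tell the two formats apart before handing a file to Bat is to look at its first non-blank line. This is only a sketch using the standard library, not part of Bat: `guess_bro_log_format` is a hypothetical helper name, and it assumes tab-separated logs start with a `#separator` line and JSON logs start with `{`.

```python
def guess_bro_log_format(path):
    """Return 'tsv', 'json', or 'unknown' based on the first non-blank line.

    Hypothetical helper (not part of Bat). Assumes default Bro logs begin
    with a '#separator' header and JSON logs begin with a '{' character.
    """
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            stripped = line.strip()
            if not stripped:
                continue  # skip leading blank lines
            if stripped.startswith("#separator"):
                return "tsv"
            if stripped.startswith("{"):
                return "json"
            return "unknown"  # e.g. data rows with the headers removed
    return "unknown"  # empty file


if __name__ == "__main__":
    import sys

    for log_path in sys.argv[1:]:
        print(log_path, guess_bro_log_format(log_path))
```

An `unknown` result is worth a closer look: it covers empty files as well as excerpts where the `#` header lines were stripped.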
@bdsmith48 Are you still having issues with this or did you get it figured out?
I mailed the same error to brian.wylie@kitware.com, but the reply said that there is no such address. I saved a 2k-row file as "bro" and am trying to load it into a data frame with the command below. My problem is that nothing happens for 1.5 hours and the CPU stays at 100%. What can cause this? Is there a mismatch between Bro versions? What version of Bro is required for Bat? I am using Python 3.6, by the way. bro_df = LogToDataFrame('/home/seckindinc/Desktop/Projects/Bro/bro')
@seckindinc When you say that you 'saved it as bro' what do you mean?
Would it be possible for you to share this file so that I or @brifordwylie could give it a go in our environments?
I run Python 3.6 as well and have used logfiles from Bro 2.5.x without any problems.
I meant that I saved a small portion of the raw log file under the name "bro". I can't share the log with you for confidentiality reasons. Can you share your Bro version or parser?
Did you leave the headers etc. in the file?
Are your logs in tab separated format or JSON?
Which log file are you trying to parse at the moment? Conn, ssl, http etc?
Also, can you parse the example files that come with Bat?
I'm running Bro v 2.5.1 and 2.5.2 on various machines at the moment.
There is no header in the file. It is tab separated. I have multiple types of Bro logs. I am assuming that Bro automatically parses these? I haven't tried yet, but I will soon.
I have multiple types of Bro logs. I am assuming that Bat* automatically parses these?
I have tried it with your examples. I think I need to check my log files for format issues.
I think I might have replicated your issue. I removed the headers from one of the test files and now it doesn't load properly.
I'm in the middle of some Christmas celebrations here so can't fully verify in the code right now, but my guess is that the headers are used to verify what file it is that's being opened and what fields it contains. I'm sure @brifordwylie can confirm whether or not this is true once he sees these messages and has a minute to spare over the holidays.
If you want to test you can take the headers from your original log file and add them to your exported one.
Just to confirm - what I call headers are the following lines, this example is from the dns.log file:
#separator \x09
#set_separator ,
#empty_field (empty)
#unset_field -
#path dns
#open 2014-04-03-10-08-27
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p proto trans_id query qclass qclass_name qtype qtype_name rcode rcode_name AA TC RD RA Z answers TTLs rejected
#types time string addr port addr port enum count string count string count string count string bool bool bool bool count vector[string] vector[interval] bool
Please let me know if that makes any difference.
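To see what those lines carry, here is a minimal sketch that collects the leading `#` lines of a log into a dict. It is not Bat's actual reader; `read_bro_header` is a hypothetical name, and it assumes the header values are split by the character named on the `#separator` line (a tab by default).

```python
def read_bro_header(path):
    """Collect the leading '#' lines of a Bro log into a dict of lists.

    Hypothetical helper, not Bat's implementation. Assumes the header
    ends at the first line that does not start with '#'.
    """
    header = {}
    separator = "\t"  # assumed default when no '#separator' line is present
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if not line.startswith("#"):
                break  # first data row: the header is over
            line = line.rstrip("\r\n")
            if line.startswith("#separator"):
                # The value is written as an escape such as \x09 (tab).
                parts = line.split(None, 1)
                if len(parts) == 2:
                    separator = parts[1].encode("ascii").decode("unicode_escape")
                header["separator"] = [separator]
                continue
            key, *values = line[1:].split(separator)
            header[key] = values
    return header


if __name__ == "__main__":
    import sys

    info = read_bro_header(sys.argv[1])
    print("path:  ", info.get("path"))
    print("fields:", info.get("fields"))
    print("types: ", info.get("types"))
```

If `fields` comes back as `None`, the file has no usable header, which matches the case where loading fails.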
When I removed everything except the data, it didn't work for me either. I think Bat requires the column names and types.
So if you leave in the lines starting with # at the top of the file, it works?
As I said in the previous comment - I think this is what's used to ascertain what file it is as well as get the field names and so on.
Thank you so much for your help. This package works great if we give detailed info about the log.
@seckindinc No worries at all - I'm glad I could help you.
@bdsmith48 - Just to check - could this be the solution to your issue too?
@swedishmike @seckindinc Yes, the reader reads in the Bro headers. All Bro versions should be putting out headers on the files. If you cut/paste some of the rows into another file you'll need to include the headers as well. I'm going to close this ticket. If @bdsmith48 wants to reopen it, we will.
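For anyone who needs a small excerpt of a large log, one way to keep the headers is to copy every `#` line along with the first few data rows. This is a sketch with the standard library only; `write_excerpt_with_headers` and the file names in the usage example are hypothetical.

```python
def write_excerpt_with_headers(source_path, dest_path, max_rows):
    """Copy the '#' header lines plus the first max_rows data rows.

    Hypothetical helper, not part of Bat. Returns the number of data
    rows written to dest_path.
    """
    rows_written = 0
    with open(source_path, "r", encoding="utf-8", errors="replace") as src, \
            open(dest_path, "w", encoding="utf-8") as dst:
        for line in src:
            if line.startswith("#"):
                dst.write(line)  # keep header (and any footer) lines as-is
            elif rows_written < max_rows:
                dst.write(line)
                rows_written += 1
            else:
                break  # enough data rows copied
    return rows_written


if __name__ == "__main__":
    # Example file names only; substitute your own log.
    count = write_excerpt_with_headers("conn.log", "conn_excerpt.log", 2000)
    print("wrote", count, "rows")
```

The resulting excerpt should then load the same way as a full log, since the `#fields` and `#types` lines are still at the top.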
My code just hangs when trying to use bat on a log file
from bat import bro_log_reader
reader = bro_log_reader.BroLogReader('ssh.log')
for row in reader.readrows():
    print(row)
This code never completes or prints anything; it seems to get stuck in the for loop.