SafeAF / enterprise-log-search-and-archive

Automatically exported from code.google.com/p/enterprise-log-search-and-archive
0 stars 0 forks source link

Cannot tell that logs are being loaded #6

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I have loaded a node and a web using the install.sh script and when I tail 
/var/log/elsa/node.log I get what looks like proper activity as far as I can 
tell.  I am attaching an excerpt from node.log so you can see if I am missing 
something.

There are definitely lots of logs arriving on the system as verified by tcpdump.

Yes, no matter what query I do I always get an empty result set even using an 
IP that I have seen with tcpdump as having arrived from evtsys on a windows 
server.  I have other firewall logs which the system does not know how to parse 
which I expect at this time to be missing.

Also, the stats pages shows no information on the graphs for stats: load, 
stats: index and stats: archive.

Something seems to be awry but I am not sure how to best troubleshoot.

Can you please help me sort this out?

Original issue reported on code.google.com by edavi...@gmail.com on 10 Feb 2012 at 5:41

Attachments:

GoogleCodeExporter commented 9 years ago
Also, I am attaching an excerpt from web.log for the last query I tried to run 
with "10.139.0.2" as the query.  I tried to manually run all of the queries 
from the log and they all returned 0 rows.

Original comment by edavi...@gmail.com on 10 Feb 2012 at 5:48

Attachments:

GoogleCodeExporter commented 9 years ago
Ok, according to the log files, no logs are making it into ELSA.  Since you've 
verified logs are arriving at the system, it's possible that syslog-ng isn't 
piping them to ELSA for some reason.  One cause may be that the OS's syslog is 
already running and listening, which is blocking syslog-ng.  A way to check is 
to look at the syslog-ng log file in /data/elsa/log directory and verify that 
the destination "d_elsa" has logs processed.

Original comment by mchol...@gmail.com on 10 Feb 2012 at 7:52

GoogleCodeExporter commented 9 years ago
Hmmm.  Good point.  I have reconfigured syslog-ng for ELSA to use port 5140 and 
reconfigured the log forwarder to send logs to port 5140.  I can verify on the 
sending side logs are being sent and on the receiving side (ELSA) that logs are 
being received on port 5140.  Still the node.log shows as it did before.  

Not sure how to check the syslog-ng log file since d_elsa writes to elsa.pl.  
Maybe a debug somewhere in elsa.pl to check traffic flow?

Original comment by edavi...@gmail.com on 10 Feb 2012 at 8:35

GoogleCodeExporter commented 9 years ago
The piece parts all seem to be working but no logs are being processed ...

[root@elsa log]# ps ax | grep syslog
23824 ?        S      0:00 supervising syslog-ng
23825 ?        Ss     0:03 /usr/local/syslog-ng/sbin/syslog-ng
27984 ?        Ss     0:00 syslogd -m 0
32448 pts/0    R+     0:00 grep syslog
[root@elsa log]# ps ax | grep elsa
10286 ?        Ss     0:00 /bin/sh -c perl /usr/local/elsa/web/cron.pl -c 
/etc/elsa_web.conf 2>&1 > /dev/null
10287 ?        R    190:29 perl /usr/local/elsa/web/cron.pl -c 
/etc/elsa_web.conf
32343 ?        S      0:02 perl /usr/local/elsa/node/elsa.pl -c 
/etc/elsa_node.conf
32552 pts/0    S+     0:00 grep elsa

Original comment by edavi...@gmail.com on 10 Feb 2012 at 8:49

GoogleCodeExporter commented 9 years ago
What do you see if you do lsof -p 23825?  Is syslog-ng indeed listening on 
5140?  What does your syslog-ng.conf file look like?

Original comment by mchol...@gmail.com on 10 Feb 2012 at 9:09

GoogleCodeExporter commented 9 years ago
I have restarted syslog-ng since then so the PID has changed but here are the 
equivalent results.

Yes, syslog-ng is listening on 5140:
tcp        0      0 *:5140                      *:*                         
LISTEN

And at the end of the LSOF output:
syslog-ng 11024 root    6u  IPv4           42637693              TCP *:5140 
(LISTEN)
syslog-ng 11024 root    7u  IPv4           42637694              UDP *:5140

Original comment by edavi...@gmail.com on 10 Feb 2012 at 10:23

Attachments:

GoogleCodeExporter commented 9 years ago
And here is lsof from the elsa.pl pid.

Original comment by edavi...@gmail.com on 10 Feb 2012 at 10:26

Attachments:

GoogleCodeExporter commented 9 years ago
If I add a new destination to the syslog-ng file as follows I get syslogs into 
the file.  But, nothing seems to be making it to elsa.pl.

destination d_file { file("/var/log/syslog-test.log"); };

log {
...
destination(d_file);
};

Original comment by edavi...@gmail.com on 10 Feb 2012 at 10:40

GoogleCodeExporter commented 9 years ago
Found an article mentioning syslog-ng-ctl as a potential utility for debug and 
it does help.  Shows that logs are being processed:

# /usr/local/syslog-ng/sbin/syslog-ng-ctl stats
SourceName;SourceId;SourceInstance;State;Type;Number
destination;d_elsa;;a;processed;12335
global;payload_reallocs;;a;processed;0
global;msg_clones;;a;processed;0
dst.program;d_elsa#0;perl /usr/local/elsa/node/elsa.pl -c 
/etc/elsa_node.conf;a;dropped;11076
dst.program;d_elsa#0;perl /usr/local/elsa/node/elsa.pl -c 
/etc/elsa_node.conf;a;processed;12335
dst.program;d_elsa#0;perl /usr/local/elsa/node/elsa.pl -c 
/etc/elsa_node.conf;a;stored;0
source;s_network;;a;processed;12335
global;sdata_updates;;a;processed;0
center;;received;a;processed;0
center;;queued;a;processed;0

Original comment by edavi...@gmail.com on 10 Feb 2012 at 11:14

GoogleCodeExporter commented 9 years ago
Ok, elsa.pl is definitely the problem, then.  Try this:
echo "test" | perl elsa.pl
Are there any error messages?

Original comment by mchol...@gmail.com on 11 Feb 2012 at 1:25

GoogleCodeExporter commented 9 years ago
Here is what I get with that command:

# echo test | perl /usr/local/elsa/node/elsa.pl
Use of qw(...) as parentheses is deprecated at /usr/local/elsa/node/Indexer.pm 
line 391.
Validating directory...
Use of uninitialized value in multiplication (*) at 
/usr/local/elsa/node/Indexer.pm line 778.
DBD::mysql::st execute failed: Column 'start' cannot be null at 
/usr/local/elsa/node/Indexer.pm line 874.
Can't locate object method "sql_error" via package "DBD::mysql::st execute 
failed: Column 'start' cannot be null at /usr/local/elsa/node/Indexer.pm line 
874.
" (perhaps you forgot to load "DBD::mysql::st execute failed: Column 'start' 
cannot be null at /usr/local/elsa/node/Indexer.pm line 874.
"?) at /usr/local/elsa/node/Indexer.pm line 891.

Original comment by edavi...@gmail.com on 13 Feb 2012 at 2:09

GoogleCodeExporter commented 9 years ago
Ok, run update-from-svn.sh to grab the latest code which has a small fix in it 
to remove the "sql_error" problem.  However, you will still get a "column 
'start' cannot be null" error.  In the /data/elsa/log/node.log file, what is 
given for "args:" just above this error?

Original comment by mchol...@gmail.com on 13 Feb 2012 at 2:51

GoogleCodeExporter commented 9 years ago
* DEBUG [2012/02/13 08:08:55] /usr/local/elsa/node/Indexer.pm (586) 
Indexer::load_records 24828 args: $VAR1 = {
          'file' => '/data/elsa/tmp/buffers/04hILtofcL'
        };

I did the update-from-svn.sh and received revision 183.

This is what I get this time:

# echo test | perl /usr/local/elsa/node/elsa.pl
Use of qw(...) as parentheses is deprecated at /usr/local/elsa/node/Indexer.pm 
line 391.
Validating directory...
Use of uninitialized value in multiplication (*) at 
/usr/local/elsa/node/Indexer.pm line 778.
DBD::mysql::st execute failed: Column 'start' cannot be null at 
/usr/local/elsa/node/Indexer.pm line 874.
DBD::mysql::st execute failed: Column 'start' cannot be null at 
/usr/local/elsa/node/Indexer.pm line 874.
[root@elsa conf]# tail -n 5000 /data/elsa/log/node.log | grep -A5 args:
* DEBUG [2012/02/13 09:27:12] /usr/local/elsa/node/Indexer.pm (586) 
Indexer::load_records 29297 args: $VAR1 = {
          'file' => '/data/elsa/tmp/buffers/04hILtofcL'
        };

Testing with real data ... will update shortly.

Original comment by edavi...@gmail.com on 13 Feb 2012 at 3:28

GoogleCodeExporter commented 9 years ago
# ls -al /data/elsa/tmp/buffers/04hILtofcL
-rw------- 1 root root 458055 Feb  8 14:17 /data/elsa/tmp/buffers/04hILtofcL

Original comment by edavi...@gmail.com on 13 Feb 2012 at 3:32

GoogleCodeExporter commented 9 years ago
Still a no-go on live data feed.  I tried a sample log entry that I forwarded 
in ticket 5 and still the same output as commend 13 above and no data showing 
in the database.
I do not know if this is the way it is supposed to be but there are no tables 
in the database syslog_data.  Not sure how that one is used but from the 
comments it looks like that is where the actual data for the logs is stored.  
If that is the case, there should be tables and the fact that they are missing 
is a bad indicator.  Is that maybe the reason for no log loading?

Original comment by edavi...@gmail.com on 13 Feb 2012 at 4:11

GoogleCodeExporter commented 9 years ago
There is an error during the create_table routine in which the tables
that actually store the data are created, which is why you're not
getting any data.  Run the update-from-svn.sh one more time and try
the echo "asdf" | elsa.pl again.  I think there may be a difference in
the way your version of either Perl or MySQL handles undefined values
at insert time.  The new code will have Perl set at time if none
given.

Original comment by mchol...@gmail.com on 13 Feb 2012 at 4:18

GoogleCodeExporter commented 9 years ago
There is an error during the create_table routine in which the tables
that actually store the data are created, which is why you're not
getting any data.  Run the update-from-svn.sh one more time and try
the echo "asdf" | elsa.pl again.  I think there may be a difference in
the way your version of either Perl or MySQL handles undefined values
at insert time.  The new code will have Perl set at time if none
given.

On Mon, Feb 13, 2012 at 10:12 AM,
<enterprise-log-search-and-archive@googlecode.com> wrote:

Original comment by mchol...@gmail.com on 13 Feb 2012 at 4:18

GoogleCodeExporter commented 9 years ago
Ok, now at version 184.  I have perl 5.14.2 on CentOS 5.6.

This time I get :

# echo "asdf" | perl /usr/local/elsa/node/elsa.pl
Use of qw(...) as parentheses is deprecated at /usr/local/elsa/node/Indexer.pm 
line 391.
Validating directory...
Use of uninitialized value $db_size in concatenation (.) or string at 
/usr/local/elsa/node/Indexer.pm line 94.
Use of uninitialized value in numeric gt (>) at /usr/local/elsa/node/Indexer.pm 
line 428.
Use of uninitialized value $db_size in concatenation (.) or string at 
/usr/local/elsa/node/Indexer.pm line 112.
Use of uninitialized value $db_size in addition (+) at 
/usr/local/elsa/node/Indexer.pm line 120.

And:

mysql> show tables;
+-----------------------+
| Tables_in_syslog_data |
+-----------------------+
| syslogs_archive_1     |
| syslogs_index_1       |
+-----------------------+

Maybe we are closer.  I am going to try live data ...

Original comment by edavi...@gmail.com on 13 Feb 2012 at 8:59

GoogleCodeExporter commented 9 years ago
The logs look different now and it seems to have settled a bit and is not just 
starting over and over and over again.  

Copyright (c) 2001-2012, Andrew Aksyonoff
Copyright (c) 2008-2012, Sphinx Technologies Inc (http://sphinxsearch.com)

using config file '/usr/local/etc/sphinx.conf'...
indexing index 'perm_200'...
collected 1 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 1 docs, 5 bytes
total 0.009 sec, 512 bytes/sec, 102.43 docs/sec
total 2 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
total 6 writes, 0.000 sec, 0.2 kb/call avg, 0.0 msec/call avg
rotating indices: succesfully sent SIGHUP to searchd (pid=26465).
* TRACE [2012/02/13 15:01:57] /usr/local/elsa/node/Indexer.pm (1292) 
Indexer::_sphinx_index 9859 ran cmd: /usr/local/sphinx/bin/indexer --config 
/usr/local/etc/sphinx.conf --rotate perm_200 2>&1
* INFO [2012/02/13 15:01:57] /usr/local/elsa/node/Indexer.pm (1321) 
Indexer::_sphinx_index 9859 Indexed perm_200 with 1 rows in 0.11439 seconds 
(8.74156 rows/sec)
* TRACE [2012/02/13 15:01:57] /usr/local/elsa/node/Indexer.pm (400) 
Indexer::_validate_directory 9859 Finished wiping indexes
* TRACE [2012/02/13 15:01:57] /usr/local/elsa/node/Indexer.pm (1025) 
Indexer::_release_lock 9859 Unlocked directory
* INFO [2012/02/13 15:01:57] /usr/local/elsa/node/Indexer.pm (254) 
Indexer::initial_validate_directory 9859 Loading 0 existing buffers...
* DEBUG [2012/02/13 15:01:57] /usr/local/elsa/node/Indexer.pm (94) 
Indexer::_get_current_archive_size 9859 Current size of archived logs in 
database is 488612
* DEBUG [2012/02/13 15:01:57] /usr/local/elsa/node/Indexer.pm (112) 
Indexer::_get_current_index_size 9859 Current size of indexed logs in database 
is 488612
* DEBUG [2012/02/13 15:01:57] /usr/local/elsa/node/Indexer.pm (118) 
Indexer::_get_current_index_size 9859 Found index size of 1088100
* TRACE [2012/02/13 15:01:57] /usr/local/elsa/node/Indexer.pm (1004) 
Indexer::_get_lock 9859 Locked directory
* TRACE [2012/02/13 15:01:57] /usr/local/elsa/node/Indexer.pm (1025) 
Indexer::_release_lock 9859 Unlocked directory
* DEBUG [2012/02/13 15:01:57] /usr/local/elsa/node/elsa.pl (126) main:: 9859 
Starting process_batch

* TRACE [2012/02/13 15:02:57] /usr/local/elsa/node/elsa.pl (199) main::__ANON__ 
9859 ALARM
* DEBUG [2012/02/13 15:02:57] /usr/local/elsa/node/elsa.pl (231) 
main::_process_batch 9859 Finished job process_batch with cache hits: 0 and 0 
new programs
* DEBUG [2012/02/13 15:02:57] /usr/local/elsa/node/elsa.pl (130) main:: 9859 
Processed 0 records
* DEBUG [2012/02/13 15:02:58] /usr/local/elsa/node/elsa.pl (126) main:: 9859 
Starting process_batch
* TRACE [2012/02/13 15:03:58] /usr/local/elsa/node/elsa.pl (199) main::__ANON__ 
9859 ALARM
* DEBUG [2012/02/13 15:03:58] /usr/local/elsa/node/elsa.pl (231) 
main::_process_batch 9859 Finished job process_batch with cache hits: 0 and 0 
new programs
* DEBUG [2012/02/13 15:03:58] /usr/local/elsa/node/elsa.pl (130) main:: 9859 
Processed 0 records
* DEBUG [2012/02/13 15:03:59] /usr/local/elsa/node/elsa.pl (126) main:: 9859 
Starting process_batch

But still no stats and not results ... yet ... I am going to let it run a bit 
and see if it "catches up" and starts working.

Original comment by edavi...@gmail.com on 13 Feb 2012 at 9:05

GoogleCodeExporter commented 9 years ago
This looks promising as something you changed has help this start working:

* TRACE [2012/02/13 15:59:56] /usr/local/elsa/node/elsa.pl (236) 
main::_process_batch 13060 Adding programs: $VAR1 = {
          'msexchangeis' => {
                              'class_id' => '4',
                              'id' => 1075099538
                            },
          'service_control_manager' => {
                                         'class_id' => '4',
                                         'id' => 2017122033
                                       }
        };
* TRACE [2012/02/13 15:59:56] /usr/local/elsa/node/elsa.pl (257) 
main::_process_batch 13060 inserted filename /data/elsa/tmp/buffers/wQJKk_FBp7 
with batch_counter 9547

Still no search results and stats graphs still showing blank for the data 
load/index/etc.  I guess I will let it sit for a while and see what it does.

Original comment by edavi...@gmail.com on 13 Feb 2012 at 10:05

GoogleCodeExporter commented 9 years ago
I am also seeing this in the logs from time to time:

* DEBUG [2012/02/13 16:15:06] /usr/local/elsa/node/elsa.pl (126) main:: 15052 
Starting process_batch
* ERROR [2012/02/13 16:15:36] /usr/local/elsa/node/elsa.pl (216) 
main::_process_batch 15052 Unable to parse log line: 1329171336 1.2.3.4    "    
   unknown                                                 .  Only parsed into:
$VAR1 = [
          '1329171336',
          '1.2.3.4',
          '"',
          1
        ];

Original comment by edavi...@gmail.com on 13 Feb 2012 at 10:17

GoogleCodeExporter commented 9 years ago
Ok, definitely have data being loaded now.  I can see the syslog_data tables 
growing and stats table has info:

| 2012-02-13 16:31:28 | load    |  6681776 |  9602 |    0.237067 |
| 2012-02-13 16:31:29 | archive |  6681776 |  9602 |    0.217388 |
| 2012-02-13 16:31:29 | index   |  6383789 |  9602 |     1.19389 |
| 2012-02-13 16:32:28 | load    |  5712966 |  8465 |    0.232577 |
| 2012-02-13 16:32:29 | archive |  5712966 |  8465 |    0.213469 |
| 2012-02-13 16:32:29 | index   |  5462087 |  8465 |     1.02827 |
| 2012-02-13 16:33:28 | load    |  5957047 |  8785 |    0.233034 |
| 2012-02-13 16:33:29 | archive |  5957047 |  8785 |     0.22022 |
| 2012-02-13 16:33:29 | index   |  5701927 |  8785 |     1.11837 |

But cannot get any query results from the web UI.  I will open a separate 
ticket on that as well as update issue 5 as the fields are not being parsed 
properly.

Original comment by edavi...@gmail.com on 13 Feb 2012 at 10:37

GoogleCodeExporter commented 9 years ago
What do you get when you run the MySQL query:
SELECT * FROM syslog_data.syslogs_index_1 LIMIT 10;

Do you see any logs saying "Processed n records" where n is greater than 0?

Original comment by mchol...@gmail.com on 14 Feb 2012 at 2:30

GoogleCodeExporter commented 9 years ago
No.  No rows with that string in them.

Original comment by edavi...@gmail.com on 15 Feb 2012 at 3:03

GoogleCodeExporter commented 9 years ago
Closing for now due to lack of activity.

Original comment by mchol...@gmail.com on 30 Apr 2012 at 1:53