fledder opened 9 years ago
I haven't thought about it, only because I haven't had a need. I would love for someone else to jump in and work on something to help expand the capabilities.
I should already have some examples of sending JSON straight to a TCP port.
I think the most appropriate approach might be to write something a little generic that sends the output to a TCP port in the format you specify. It would be more work but seems like it would be more generic and useful across the board.
Let me take a look at what I have on my local repo this weekend and I'll try to make sure I've got my latest codeset up here so you can branch it and add this function.
Oh and one more thing is that my preference would also be to write this as a standalone App and not a core part of the library. I want to keep the library to just a library. Now, if we need to add something to generate a specific output format for a particular logging engine I think that's ok but the mechanics of transmission should be in an external app.
THX!
-Andy
I still have yet to dive in, so forgive any misunderstandings, but my vision for how I would use it would be to create a config file with the desired log source and output format, then run the forwarder as a service. I'm still trying to decide whether it would be better to have different services for JSON vs arbitrary TCP / UDP vs syslog etc., or have that be configurable as well. Or we could just do a single "template" file that would contain the JSON markup or syslog or whatever, and use replacement variables for each of the log fields.
Do you do any filtering before forwarding?
I think we are generally on the same page with the concept of a forwarding service. One thing I might encourage is to see if we can find something that does this already with just stdout from a console app.
Specifically I'm wondering if we could leverage something like http://www.rsyslog.com/ or a similar existing solution. Then we could spend most of the time writing up great tutorials and maybe keep the actual console app required down to an ultra minimalist solutions, leveraging the great work of others for formats, filtering, etc.
As for filtering, I do have some filtering capabilities built in now... but to be fair I'd need to go back and look at the library to remember exactly how I do it :-).
There is a logstash-forwarder which has the advantage of being native to Logstash, but there don't seem to be any binaries distributed for Windows. The one I started with is nxlog which has a neat setup where you specify the plugin you use to retrieve the data and how it gets forwarded.
I believe both of those will read from a file, but that's a bit clunky to repeat everything to a file only to send it on again. I think they both support stdin as well, but I'm not as clear on how that works. I haven't used rsyslog, but it looks more open-sourcey than nxlog, which does the community vs enterprise thing I find so infuriating.
I'll probably start with nxlog and see where I get. It may turn out that we only have to provide some config files for nxlog and logstash to handle formatting, though I would like to be able to have a single install script that would make it easy to set up. I'd also like to look at somehow extending nxlog to add a plugin for it, though I have no idea if that's feasible. I'll also take a look at rsyslog to see if it's easier to write a plugin for that.
I'm torn on the idea of filtering... it seems to me that you shouldn't log it locally if it's not worth forwarding, which would make me think that the appropriate place to filter is in SMC, then in Logstash if necessary.
Have you ever written a Windows service in C#?
One way or another I'll prob get into it sometime this month, I hope. I'm sure you're familiar with the schedules of the consulting business...
After a bit of poking about with rsyslog, it looks like the open source version is Linux-only, and the Windows version is paid. I'll focus on nxlog for now.
Sounds like a good direction to me.
I have an initial prototype working with nxLog. I'll throw some documentation in there and try to figure out this github pull request thing... prob this weekend you should see something.
Any progress to post? I will be happy to integrate your work without a PR
Having dropped off the project that was driving my interest in 2015, and now back on a project that this would be good for... I am picking this back up, hopefully. I think this may be easier now as Elastic now has beats (forwarders) that can ingest a file, one of which claims to ingest data from stdin. I'll poke around a bit and see what I can figure out.
Sounds good. Let me know if you need any support from my end. As you can tell there hasn't been much activity on the core library in a while. That's not a sign of lack of interest, it's more that there really isn't much else to add or change unless someone raises a PR for a bug or new feature.
Apologies for the questions... I barely remember anything about this from last time. Would the approach be for me to create an example project within the current solution showing how to create a Windows service that will use Filebeat to send data to elastic? Or is there a better way of doing it? I think I have Filebeat working with stdin, just not sure how to go about setting up the rest of it...
You are welcome to create a Windows service, but there might be easier ways these days. For my recent applications I have been using NSSM to run console apps as services. Also, I don't know much about how you can run Filebeat, but I know with Splunk you could configure something to run a command on an interval and then consume the output as STDIN. Don't know if that would be easier.
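For reference, wrapping a console app as a service with NSSM is only a couple of commands. A sketch, run from an elevated prompt; the service name and paths here are placeholders:

```
nssm install aaLogForwarder "C:\tools\aaLogForwarder\aaLogForwarder.exe"
nssm set aaLogForwarder AppDirectory "C:\tools\aaLogForwarder"
nssm start aaLogForwarder
```

NSSM then restarts the app if it exits, which is most of what you'd otherwise hand-write in a ServiceBase implementation.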
@fledder Were you able to get Filebeat to work? I looked at that a few months ago and it appeared to only handle text files. So I've been working on a custom Beat to read the logs. It's been a big learning curve as I learned Go and how to integrate with Elasticsearch and Kibana along the way. I've had it running on a handful of VMs here in my office for a few weeks and it seems to work OK.
@arobinsongit, I was wondering if that would be a project you'd be interested in incorporating into this aaLog repository, and now here we are talking about similar things. The timing is great.
The project is currently in a private Github repository but I'd be happy to share it with both of you if interested. My boss and I have had some discussions about making that repository public, but we also think it would be fine to add it to this aaLog repository (if you'd like) to give back and to keep all of the log reading projects together. What do you think?
@logic-danderson all contributions welcome. I certainly appreciate the sentiment of giving back :-). Two approaches I could see are to fold it into this repo, or I can put a link in the ReadMe. I'll let you make the call on what you think would work best. Happy to work with you on whatever approach you want to take.
@arobinsongit Sweet. I'm fine with including it into this repo if that works for you. Just not sure mechanically how to do that.
You think I know :-). Actually I think what you will need to do is fork the repo, clone your forked repo down to your box, add your new code, commit, and then issue a pull request that I will accept. The other method would just be to send it to me, but let's try to do it the proper gitty way first.
@logic-danderson Filebeat now handles stdin as well. Not sure how long that has been a feature. I actually just yesterday got it to work, with a Windows service running the Filebeat exe periodically. I'll clean up the code and push it to my github in a few mins here. I'm going to try and figure out how to write an installer before I do a PR here, but for now here are the main bits of info I used to figure it out:
https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-stdin.html https://discuss.elastic.co/t/filebeat-does-not-exit-on-eof-input-type-stdin/69545/2
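For anyone following along, the stdin setup from that first link boils down to a very small filebeat.yml. A sketch, assuming a recent Filebeat and a placeholder Elasticsearch host:

```yaml
# Read events from the stdin of the Filebeat process,
# one event per line.
filebeat.inputs:
  - type: stdin

# Ship directly to Elasticsearch (host is a placeholder).
output.elasticsearch:
  hosts: ["localhost:9200"]
```

The second link covers the gotcha that Filebeat does not exit on EOF by itself, which matters when a service is launching the exe periodically.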
A custom beat actually sounds really useful... my solution has a whole separate service to manage, so a new beat would probably be less overhead.
Basic working solution is up at https://github.com/fledder/aaLog/tree/aaLogElasticFileBeat but I have a few more things to do before I make a PR. Should be done this week.
Thank you both. I'll have to research how to integrate a Go project into a larger repository. I'm new to Go but it seems like it expects each project to be in its own repository. I'll investigate when I get time.
Short story is I decided to make the current repository public instead of trying to merge it into the aaLog repository. The repository is here: https://github.com/logic-danderson/aalogbeat
If you want the binary to test check the Releases tab or here: https://github.com/logic-danderson/aalogbeat/releases/tag/0.1
Since aaLog is a .NET solution and Aalogbeat is a Go project, I don't think merging them would go well. Go in general, and the Elastic Beats tooling in particular, seem to be quite picky about the build environment. And you need to build this on Linux anyway (the readme shows how to build the Windows binary from a Linux build host).
So for now, at least, I'm going to leave it where it is. Please take a look.
Thanks.
Aaahh - so you completely rewrote the library in Golang. I had been wanting to learn Golang and actually thought this would be a good project to do it. Looking forward to seeing how you did it. I'll add something to my ReadMe directing people to your library.
Here's what I'd be really curious about. Can you run a performance test to see how many messages/second you get? It's been a while for me but on a laptop with 32GB of RAM and an i7 processor I remember getting about 50k/second.
Pull request is in. It may not be a masterpiece but it appears to work...
@arobinsongit Good question. I hadn't been all that worried about benchmarking it.
I just ran a test on my development Linux VM that has 4 GB RAM and 2 CPUs. I copied several aaLOG files over to it and timed how long it took to read them. In repeated executions it's taking about 1.3 seconds to read 75,544 records (roughly 58,000 records/second), so about the same as what you saw. That's just reading the records into memory, not sending them anywhere.
IRL the Beat wouldn't be that fast. It sends the records to Elasticsearch in much smaller batches, so there's the network latency, waiting to confirm the previous batch was received before sending the next one, etc.
Have you given any thought to setting something up to forward to Logstash / ELK servers? I like Splunk, but the restrictions on throughput irk me, and every Wonderware system I've ever seen has a metric crapload of log messages happening.
Seems like it would be fairly easy, as the default for Logstash seems to be a JSON structure via TCP. Would need to set something up in the project as well as provide an input/filter for the Logstash server config.
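For what it's worth, the Logstash side of that could be as small as a tcp input with a JSON-lines codec. A sketch of the pipeline config, with an arbitrary port and a placeholder Elasticsearch host:

```
# Accept newline-delimited JSON over TCP (port is arbitrary).
input {
  tcp {
    port  => 5140
    codec => json_lines
  }
}

# Forward to Elasticsearch (host is a placeholder); filters for
# parsing or dropping messages would go in a filter {} block.
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
```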
Would this be something you'd be interested in having in this project if I get around to working on it? I'm certainly no professional programmer, but I'll take a crack at it if you're interested.