Closed sentinelleader closed 8 years ago
On Thu, Oct 29, 2015 at 7:17 PM, Guardian Sentinel <notifications@github.com> wrote:
Hey,
I've been using node-logstash for quite a while and it's amazing :) All this time I was using it with a single config file. Now I have to use multiple configs, so I've started using the config_dir option. But ever since I started using that option, I've been seeing duplicated logs. The number of duplicates exactly matches the number of config files present in the config directory.
Each config reads its own input files, and they do not even live in the same folder. lsof shows that each child process is reading the other input files too :(
node 13516 root 28r REG 202,96 8545741 5508505 /mnt/log/xxx/server-events-2015-10-29.jslog
SignalSen 13516 13519 root 28r REG 202,96 8545741 5508505 /mnt/log/xxx/server-events-2015-10-29.jslog
node 13516 13520 root 28r REG 202,96 8545741 5508505 /mnt/log/xxx/server-events-2015-10-29.jslog
node 13516 13521 root 28r REG 202,96 8545741 5508505 /mnt/log/xxx/server-events-2015-10-29.jslog
node 13516 13522 root 28r REG 202,96 8545741 5508505 /mnt/log/xxx/server-events-2015-10-29.jslog
node 13516 13523 root 28r REG 202,96 8545741 5508505 /mnt/log/xxx/server-events-2015-10-29.jslog
Each child process is accessing the same files, and this repeats for all the other input files.
What do you mean by child process? node-logstash does not use multiple processes. Maybe what you are seeing are multiple threads instantiated by Node itself, but not multiple processes.
I now have huge amounts of data, and since it's getting multiplied about 6x, I'm filling about 500GB per day :(
Can you check that you have no * wildcards that cover the same file? Can you provide an extract from your config: grep input * ?
Bertrand
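The overlap check suggested above can be sketched in a few lines. This is only an illustration: the two patterns below are hypothetical stand-ins, not the reporter's actual config, and Python's `fnmatchcase` is used as an approximation of wildcard matching (its `*` also matches `/`, so it errs on the side of reporting an overlap). node-logstash's own wildcard handling may differ.

```python
from fnmatch import fnmatchcase


def overlapping_patterns(patterns, path):
    """Return the config names whose glob pattern matches the given path."""
    return [name for name, pattern in patterns.items() if fnmatchcase(path, pattern)]


# Hypothetical patterns, one per config file (assumed for illustration only).
patterns = {
    "app-backend-server-events.json": "/mnt/log/logger/backend/server-events/server-events*.jslog",
    "app-backend-latency.json": "/mnt/log/logger/backend/latency/latency*.jslog",
}

# Hypothetical log file path used to probe the patterns.
path = "/mnt/log/logger/backend/server-events/server-events-2015-10-29.jslog"

matches = overlapping_patterns(patterns, path)
print(matches)
```

If more than one config name comes back for a single file, two inputs are covering the same file and that alone would explain duplicated events.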
Sorry, I misread the lsof output. They were indeed thread IDs. Below are the input source details. Though I use wildcards for the files, their parent folders are different in each input source:
app-backend-exceptions.json:input://file:///mnt/log/logger/backend/exceptions/exceptions.jslog?type=logger
app-backend-latency.json:input://file:///mnt/log/logger/backend/latency/latency.jslog?type=logger
app-backend-server-events.json:input://file:///mnt/log/logger/backend/server-events/server-events.jslog?type=logger
app-backend-streaming-events.json:input://file:///mnt/log/logger/backend/streaming-events/streaming-events.jslog?type=logger
app-backend-task-runner.json:input://file:///mnt/log/logger/backend/task-runner/task-runner.jslog?type=logger
app-frontend-combined.json:input://file:///mnt/log/logger/frontend/combined/combined.jslog?type=logger
I do not see any wildcards in this config :(
Yikes, looks like Markdown removed it :( Basically it's *.jslog?type=logger for each input.
Hi,
When I edit your post, I see this:
app-frontend-combined.json:**input://file:///mnt/log/logger/frontend/combined/**combined*.jslog?type=logger
What is your exact config? Do you use a double *?
Can you post your config in a gist or pastebin?
Yes gist would be perfect, https://gist.github.com/sentinelleader/3384049825a5095164d2
Hi,
It seems your filter and output lines are duplicated. You have two solutions
Regards,
Bertrand
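To see why duplicated output lines would produce one copy per config file, here is a toy model. It assumes, as the diagnosis above suggests but this thread does not spell out, that every file in config_dir is merged into a single pipeline in which each input feeds each output. The function `run_merged_pipeline`, the dictionary layout, and the `output://shared` name are all hypothetical and are not node-logstash code.

```python
def run_merged_pipeline(config_files, event):
    """Deliver one event to every output declared across all config files.

    Assumption: the files are merged into one pipeline, so an event from
    any input reaches every output, whichever file declared it.
    """
    outputs = [out for cfg in config_files for out in cfg["outputs"]]
    return [(out, event) for out in outputs]


# Six hypothetical config files, each repeating the same output line.
duplicated = [
    {"input": f"input-{i}", "outputs": ["output://shared"]} for i in range(6)
]
deliveries = run_merged_pipeline(duplicated, "one log line")
print(len(deliveries))  # one delivery per config file

# The same six inputs, with the output declared in only one extra file.
declared_once = [{"input": f"input-{i}", "outputs": []} for i in range(6)]
declared_once.append({"input": None, "outputs": ["output://shared"]})
single = run_merged_pipeline(declared_once, "one log line")
print(len(single))
```

Under that assumption, six files with the same output line give six deliveries per event, which matches the 6x growth reported. Declaring the output once is just one way to illustrate the effect and is not necessarily either of the two solutions mentioned above.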