IKANOW / Aleph2-examples

Example projects demonstrating the Aleph2 platform and its supported technologies
Apache License 2.0

Logstash centralized logging issues #21

Open Alex-Ikanow opened 8 years ago

Alex-Ikanow commented 8 years ago

The "message" field was being written in "command"

(Longer term: we really need to launch an Aleph2-friendly external process that wraps the logging like the v1 version did and stops it as soon as the right number of records/logs has been received. Currently it keeps dumping GBs of logs each time.)

Alex-Ikanow commented 8 years ago

Another error that probably comes from my attempts to tidy up the big log files after a test has completed:

Error getting logstash test output: [/tmp/aleph2_testing_logs_checkpoint_firew__10a226573f38: NoSuchFileException]
:[UnixException.java:86:sun.nio.fs.UnixException:translateToIOException]
[Files.java:3785:java.nio.file.Files:lines]
[LogstashUtils.java:249:com.ikanow.aleph2.harvest.logstash.utils.LogstashUtils:sendOutputToLogger]
[LogstashHarvestService.java:138:com.ikanow.aleph2.harvest.logstash.services.LogstashHarvestService:onUpdatedSource]
[DataBucketHarvestChangeActor.java:355:com.ikanow.aleph2.data_import_manager.harvest.actors.DataBucketHarvestChangeActor:lambda$null$80]
[-1:com.ikanow.aleph2.data_import_manager.harvest.actors.DataBucketHarvestChangeActor$$Lambda$1243:apply]
[Patterns.java:93:com.ikanow.aleph2.data_model.utils.Patterns$Matcher:when]
[DataBucketHarvestChangeActor.java:354:com.ikanow.aleph2.data_import_manager.harvest.actors.DataBucketHarvestChangeActor:lambda$talkToHarvester$88]

Alex-Ikanow commented 8 years ago

The disk is filling up because the main Aleph2 service deletes the log files while it still holds unclosed file handles on them, causing a resource leak:

> lsof | grep tmp
16432 /tmp/aleph2_testing_logs_checkpoint_firew__d57f5d0770ef (deleted)
java       2156        tomcat  166r      REG             202,64     192764                                                                   16440 /tmp/aleph2_testing_logs_checkpoint_firew__10a226573f38 (deleted)
java       2156        tomcat  168r      REG             202,64 1584912877                                                                   16363 /tmp/aleph2_testing_logs_checkpoint_firew__d57f5d0770ef (deleted)
java       2156        tomcat  169r      REG             202,64 1480907270           

From Slack: in here

Probably need the construct:

try (Stream<String> lines = Files.lines(...)) {
   // the stream, and the file handle behind it, is closed when this block exits
   lines.etc(...);
}
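
A fuller version of that construct might look like the sketch below. The class and method names (`LogLinesSketch`, `firstLines`) and the `maxLines` cap are made up for illustration and are not taken from `LogstashUtils`. The only point is that the `Stream` returned by `Files.lines` sits in a try-with-resources, so the underlying file handle is released even if processing throws.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration only, not the actual LogstashUtils code.
class LogLinesSketch {

    // Reads at most maxLines lines from logFile and closes the file afterwards.
    static List<String> firstLines(Path logFile, int maxLines) throws IOException {
        // Files.lines keeps the file open until the stream is closed,
        // so the stream is declared as a try-with-resources resource.
        try (Stream<String> lines = Files.lines(logFile)) {
            return lines.limit(maxLines).collect(Collectors.toList());
        }
    }
}
```

Without the try-with-resources, the handle stays open until the stream is garbage collected or closed by hand, which would match the "(deleted)" entries in the lsof output.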

(Easy enough to test: just run the test a few times and check via the lsof line that none of those orphaned "(deleted)" files are left.)

While you're there, make it not throw an exception if the log file doesn't exist.
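
One possible way to do that, as a sketch only: catch `NoSuchFileException` around the read and treat a missing file as "no output". `MissingLogSketch` and `readLinesIfPresent` are hypothetical names, and returning an empty list is an assumption about what the caller wants; the real code might prefer to log a message instead.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration only, not the actual LogstashUtils code.
class MissingLogSketch {

    // Returns the lines of logFile, or an empty list if the file does not exist.
    static List<String> readLinesIfPresent(Path logFile) throws IOException {
        try (Stream<String> lines = Files.lines(logFile)) {
            return lines.collect(Collectors.toList());
        } catch (NoSuchFileException e) {
            // Files.lines opens the file eagerly, so a missing file fails here.
            // Assumption: a missing log simply means there is nothing to report.
            return Collections.emptyList();
        }
    }
}
```

Catching the exception rather than calling `Files.exists` first avoids a race if the file is deleted between the check and the read.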

(FYI: you can remove the "should delete" comment at the bottom; that's handled here)