bbcarchdev / twine

An RDF workflow engine
https://bbcarchdev.github.io/twine/
Apache License 2.0

Twine accumulates memory #30

Closed simeonvandersteen closed 8 years ago

simeonvandersteen commented 8 years ago

After a few days of running:

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 5816 root      20   0 4978996 4.592g   1836 S 90.4 62.7 538:25.30 twine-writerd
 5341 root      20   0  740228 445864   1756 S  7.6  5.8 216:22.52 fakes3
 7428 puppetm+  20   0  232956  95564  92728 S  0.7  1.2 144:58.05 postgres

This is a machine with 7.5 GB of memory.

nevali commented 8 years ago

This really needs to be dug into a bit more to have any hope of tracking it down. In particular, it's important to identify whether it's twine-writerd itself or one of the modules (in which case, which one?).

cgueret commented 8 years ago

That actually had to do with Spindle. Besides a small leak, we noticed that the message queue was not informed about the presence of a cluster (https://github.com/bbcarchdev/spindle/issues/83) and was thus not using it. This caused a lot of retries, which in turn caused memory usage to accumulate and duplicate triples to be pushed to S3. This is because librdf allows duplicate N-Quads and the data loaded from S3 was not cleaned between retries. We fixed that by making use of the cluster and making sure that, in case of a retry, the data is cleaned before being reloaded. We also saw we could save a bit of memory by not loading the many incoming links on a MEDIA update.

The relevant commits are:
https://github.com/bbcarchdev/spindle/commit/1206bd78f0430b78ca9d40b3a6d9fc1f9ffe3b5f
https://github.com/bbcarchdev/spindle/commit/0e4e57f96a1ff10b7248a7c9bb3645ff8ebb9399
https://github.com/bbcarchdev/spindle/commit/fd30b2f96c5a5d2f7f62e82fe3d5fcb271a18e73
https://github.com/bbcarchdev/spindle/commit/801b78c2cdf734002646f96bc5d375dc1986767f
https://github.com/bbcarchdev/spindle/commit/5ef8a0225f83908bec298ec1c5aac9b288fe99e8
https://github.com/bbcarchdev/twine/commit/0b3762d17a7ee7b5b0f16f81e07338adad0441db
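
To illustrate the retry problem described above, here is a minimal sketch in Python rather than Spindle's C. It is not the actual Spindle code: `QuadStore`, `process` and `fetch_from_s3` are hypothetical names, and the store is a toy stand-in that keeps duplicates the way the comment says librdf does. It only shows why reloading on each retry without clearing first makes the in-memory data grow, and how clearing before the reload keeps it constant.

```python
class QuadStore:
    """Toy stand-in for a quad store that does NOT deduplicate on insert."""

    def __init__(self):
        self.quads = []

    def load(self, quads):
        # Appends blindly, so loading the same data twice stores it twice.
        self.quads.extend(quads)

    def clear(self):
        self.quads.clear()


def fetch_from_s3():
    # Hypothetical fetch: returns the same two quads on every call.
    return [
        ("ex:s", "ex:p", "ex:o1", "ex:graph"),
        ("ex:s", "ex:p", "ex:o2", "ex:graph"),
    ]


def process(store, fetch, attempts, clean_before_retry):
    """Simulate a message being handled `attempts` times (first try plus retries)."""
    for _ in range(attempts):
        if clean_before_retry:
            store.clear()  # the fix: drop stale data before reloading
        store.load(fetch())
    return store


leaky = process(QuadStore(), fetch_from_s3, attempts=3, clean_before_retry=False)
fixed = process(QuadStore(), fetch_from_s3, attempts=3, clean_before_retry=True)

print(len(leaky.quads))  # 6: two quads loaded three times
print(len(fixed.quads))  # 2: only the latest load is kept
```

In the leaky variant the store grows linearly with the number of retries, which matches the symptom reported here: many retries, growing memory, and duplicate data written back out.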