Open LatVAlY opened 3 months ago
Hey @LatVAlY! 🌟 I'm here to assist you with any bugs, questions, or contributions you have for the repository. Let's work together to resolve this issue!
To resolve the issue with TruLens causing a memory spike and pod restart on your cluster after initialization, you can use the memory_suggestions context manager to catch out-of-memory issues and provide suggestions for reducing memory usage. Here is an example of how you can use it:
from trulens.nn.backend import memory_suggestions

# Example usage
with memory_suggestions("batch size=1000"):
    # Your memory-intensive code here
    pass
Additionally, consider reducing memory-impactful parameters such as batch size, intervention sizes, and data types. The rebatch function can help you split your data into smaller batches:
from trulens.nn.backend import rebatch

# Example usage
for batch in rebatch(your_data, batch_size=500):
    # Process each batch
    pass
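As a library-independent illustration of the same batching idea, here is a minimal sketch in plain Python. The name iter_batches is a hypothetical helper introduced only for this example; it is not part of TruLens and does not claim to mirror how rebatch works internally.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(data: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `data`, each with at most `batch_size` items.

    Hypothetical helper for illustration; not a TruLens API.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(data), batch_size):
        # Slicing yields one batch at a time, so only a single batch
        # needs to be processed (and held by the caller) per iteration.
        yield data[start:start + batch_size]
```

Processing one slice per loop iteration keeps the working set bounded by the batch size rather than by the full dataset, which is the effect the smaller-batch advice above is aiming for.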
Moreover, you can use custom memory management techniques as demonstrated in the CustomMemory class from the trulens_eval library. This class allows you to control memory allocation and delay, which can help in managing memory usage more effectively:
import sys
import time


class CustomMemory:
    def __init__(self, delay: float = 0.0, alloc: int = 1024 * 1024):
        self.alloc = alloc
        self.delay = delay
        # keep a chunk of data allocated permanently:
        self.temporary = [0x42] * self.alloc
        self.messages = []

    def remember(self, data: str):
        if self.delay > 0.0:
            time.sleep(self.delay)
        self.messages.append(
            data +
            f" and I'm keeping around {sys.getsizeof(self.temporary)} bytes"
        )
By using these tools and techniques, you can manage memory usage more effectively and prevent memory spikes that lead to pod restarts [1][2].
Hi @LatVAlY - we aimed to address the root of this problem with TruLens 1.0. Was this ever resolved? If not, can you let us know if it still occurs with TruLens 1.0?
Bug Description: After adding logs around the TruLens initialization (tru = Tru()), I can see that TruLens is initialized, but memory then spikes to a level where the pod runs out of memory. With TruLens disabled, everything works fine in the pod.
I have increased the memory limit to 2G, but that didn't resolve the issue either.
Environment: trulens-eval = "^0.31.0"
I don't know what additional info I should put here to help resolve or debug the issue, but if you need more info, I'm happy to provide it.
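One piece of information that could help narrow this down is how much memory the initialization itself allocates. Below is a minimal sketch using the standard-library tracemalloc module. The name measure_peak is a hypothetical helper, and the list allocation is only a stand-in for the real tru = Tru() call. Keep in mind that tracemalloc only sees allocations made through Python's memory allocators, so memory taken by native extensions may not show up.

```python
import tracemalloc
from typing import Any, Callable, Tuple


def measure_peak(fn: Callable[[], Any]) -> Tuple[Any, int, int]:
    """Run `fn` and return (result, current_bytes, peak_bytes).

    Hypothetical helper: the byte counts cover only allocations traced by
    tracemalloc while `fn` runs, not the process's total memory use.
    """
    tracemalloc.start()
    try:
        result = fn()
        current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, current, peak


if __name__ == "__main__":
    # Stand-in workload; in the reporter's setup this would be replaced by
    # something like `lambda: Tru()` (assumption, not verified here).
    _, current, peak = measure_peak(lambda: [0] * 1_000_000)
    print(f"current={current / 1e6:.1f} MB, peak={peak / 1e6:.1f} MB")
```

If the peak reported here is small while the pod still gets killed, the spike is more likely coming from native allocations or from work done after initialization, which would be useful to note in the report.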