google-research / timesfm

TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting.
https://research.google/blog/a-decoder-only-foundation-model-for-time-series-forecasting/
Apache License 2.0

Is this normal to use 12G GPU memory to load the 200m model with default parameter? #29

Closed keefeleen closed 1 month ago

keefeleen commented 1 month ago

Our code (following the sample):

import timesfm

# backend is assumed to be set earlier, e.g. backend = "gpu"
model = timesfm.TimesFm(
    context_len=128,
    horizon_len=5,
    input_patch_len=32,
    output_patch_len=128,
    num_layers=20,
    model_dims=1280,
    backend=backend,
)
model.load_from_checkpoint(repo_id="google/timesfm-1.0-200m")

Our process then uses around 12237 MiB of GPU memory.
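For context, here is a rough back-of-the-envelope estimate of the raw weight memory for a 200M-parameter model in float32. This is only an illustration (activations, optimizer state, and framework overhead are not counted); the point is that the weights alone are far smaller than the ~12 GiB observed, which suggests the usage comes from somewhere other than the parameters themselves.

```python
# Lower bound on weight memory for a 200M-parameter model.
# Assumption: every parameter is stored as a 4-byte float32.
num_params = 200_000_000
bytes_per_param = 4  # float32

weight_mem_gib = num_params * bytes_per_param / 1024**3
print(f"{weight_mem_gib:.2f} GiB")  # ~0.75 GiB, well under the ~12 GiB reported
```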


blackcat1402 commented 1 month ago

That is a lot.

R3xpook commented 1 month ago

I have 64 GB of RAM and 64 GB of swap, and it still OOMs XD sooooo

rajatsen91 commented 1 month ago

Can you try setting the environment variable XLA_PYTHON_CLIENT_PREALLOCATE=false? By default, JAX preallocates a large fraction of GPU memory up front, so the reported usage reflects the preallocation rather than the model's actual footprint.
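A minimal sketch of how to apply this suggestion. Note that JAX reads XLA_PYTHON_CLIENT_PREALLOCATE at import time, so the variable must be set before jax (or timesfm) is imported; setting it afterwards has no effect.

```python
# Disable JAX's up-front GPU memory preallocation so memory is
# allocated on demand and nvidia-smi reflects actual usage.
# This must run before `import jax` / `import timesfm`.
import os

os.environ["XLA_PYTHON_CLIENT_PREALLOCATE"] = "false"
```

Alternatively, set it in the shell before launching the process: `XLA_PYTHON_CLIENT_PREALLOCATE=false python your_script.py`.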