Closed zakwatts closed 2 weeks ago
Yes, we are aware that it is larger than the GB (gradient boosting) model. For now we have zipped it for storage, so it needs to be unzipped after it is downloaded.
Is there a way to reduce the size? Perhaps more than just the weights are saved?
It does slow down the CI pytest run, as the model has to be downloaded before the tests can run.
We discussed this issue. We already tried to make the model as small as possible by storing it as .ubj (Universal Binary JSON). We think the reason is simply that the model has more parameters: it is quite large and uses a lot of trees. I understand that the slower tests are annoying. For the application itself it should not be much of a problem, because the download and unzip only happen the first time you use it.

What do you think about removing the xgboost model from the tests? Or, maybe better, finding another solution for the tests? Looking at the tests again, I realized that the xgboost model is not called in `test_forecast.py`. If we want to include it there, we need to change `predications_df_xgb = run_forecast(site=site, ts=ts)` to `predications_df_xgb = run_forecast(site=site, model="xgb", ts=ts)`.
Let me know if changing the tests is enough to resolve this issue for you.
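To illustrate the "only the first time" behaviour described above, here is a minimal sketch of a download-once cache. It is not the package's actual code: `ensure_model` and the `fetch_zip` callback are hypothetical names, and `fetch_zip` stands in for whatever performs the real Google Drive download.

```python
import zipfile
from pathlib import Path
from typing import Callable


def ensure_model(cache_dir: Path, model_name: str,
                 fetch_zip: Callable[[Path], None]) -> Path:
    """Return the path to the unzipped model, downloading it only if missing.

    `fetch_zip` is a hypothetical callback that writes the zipped model to the
    path it is given; the real download logic would go there.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    model_path = cache_dir / model_name
    if model_path.exists():
        # Already downloaded and unzipped on an earlier run: nothing to do.
        return model_path

    zip_path = cache_dir / f"{model_name}.zip"
    fetch_zip(zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(cache_dir)
    zip_path.unlink()  # the archive is no longer needed once extracted
    return model_path
```

If CI could keep such a cache directory between runs, the tests would only pay the download cost when the cache is empty. Whether that is practical for this repository's CI setup is an open question.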
Thanks @froukje, I'll close this for the moment.
Currently the Tryolabs model to download is 1.13 GB.
Downloading model ...
Downloading...
From (original): https://drive.google.com/uc?id=1O34gyQ67rvrP9VFkNaagTDM9IP4iqAjM
From (redirected): https://drive.google.com/uc?id=1O34gyQ67rvrP9VFkNaagTDM9IP4iqAjM&confirm=t&uuid=48065f82-5d7e-49e2-ac5c-095c7a17b40d
To: /home/zak/projects/Open-Source-Quartz-Solar-Forecast/examples/model_10_202405.ubj.zip
100%|██████████| 1.13G/1.13G [00:13<00:00, 81.5MB/s]
Preparing model ...
Loading model ...
Predictions finished.
This seems large, and perhaps it could be optimised to download only what's needed. For reference, the gradient boosted model is around 400 kB.
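One way to check whether the download contains more than the model itself is to list what is inside the zip and how large each entry is. The sketch below uses only the standard library; `archive_report` is a hypothetical helper name, not part of the project.

```python
import zipfile
from pathlib import Path


def archive_report(zip_path: Path) -> list[tuple[str, int, int]]:
    """Return (name, compressed_bytes, uncompressed_bytes) per member, largest first."""
    with zipfile.ZipFile(zip_path) as archive:
        rows = [
            (info.filename, info.compress_size, info.file_size)
            for info in archive.infolist()
        ]
    # Sort by uncompressed size so the biggest contributor is listed first.
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Example usage (path taken from the log above):
# for name, packed, unpacked in archive_report(Path("examples/model_10_202405.ubj.zip")):
#     print(f"{name}: {packed / 1e6:.1f} MB zipped, {unpacked / 1e6:.1f} MB unzipped")
```

If the archive turns out to hold a single `.ubj` file, the size comes from the model itself rather than from extra files bundled with it.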