tsenst / lightning-experiments-logger

A SageMaker Experiment logger class for PyTorch Lightning
https://medium.com/idealo-tech-blog/experiment-tracking-with-aws-sagemaker-and-pytorch-lightning-68b22fd4deee

Truncate strings to 2500 and remove newlines #2

Open martinber opened 8 months ago

martinber commented 8 months ago

The logger was failing for me because I was trying to log hyperparameter strings that were longer than 2500 characters and contained newlines.

As a quick fix I solved it by adding this after line 38:

         if not isinstance(result[keys], (int, float)):
             result[keys] = " ".join(repr(result[keys]).splitlines())[:2499]
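A standalone sketch of the same idea, for trying it outside the logger. The helper name `sanitize_hyperparameter` and its `max_length` parameter are hypothetical and not part of this library; the 2500 limit is the one mentioned in this issue. Unlike the snippet above, this version does not call `repr()` on values that are already strings, so they are not wrapped in extra quotes.

```python
MAX_LENGTH = 2500  # limit mentioned in this issue; assumed to be inclusive


def sanitize_hyperparameter(value, max_length=MAX_LENGTH):
    """Return numeric values unchanged; flatten and truncate everything else."""
    # bool is a subclass of int, so booleans pass through unchanged as well
    if isinstance(value, (int, float)):
        return value
    # Strings are used as-is; any other object falls back to its repr()
    text = value if isinstance(value, str) else repr(value)
    # splitlines() drops the line breaks; rejoin the pieces with single spaces
    single_line = " ".join(text.splitlines())
    return single_line[:max_length]
```

Applied to every value of the hyperparameter dict before logging, this keeps numbers intact and turns everything else into a single-line string of at most `max_length` characters. Truncation is silent here, so information can be lost without notice.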

I don't know if you have a better idea. I have to do this because there is code which I do not control, which does something like:

import pytorch_lightning as pl

class MyModel(pl.LightningModule):
    def __init__(self, learning_rate: float, submodule: pl.LightningModule):
        super().__init__()
        # Saves every __init__ argument, including the whole submodule
        self.save_hyperparameters()

It looks like save_hyperparameters() saves all the arguments of __init__() as hyperparameters. Arguments of type pl.LightningModule are serialized and end up in SageMaker Experiments as a very long string with newlines.

The error saying that the string must be at most 2500 characters long is easy to understand. At the same time, though, I was getting an error along the lines of "hyperparameter string should match the regex '.*'", which was caused by the newlines and was not very clear.
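A likely explanation for the regex message is that `.` does not match a newline, so a `.*` pattern that has to cover the whole value cannot match a multi-line string. Python's `re` module behaves this way by default, which makes it easy to illustrate. The sketch below is an assumption about how the service validates values, not a confirmed description of it, and `find_invalid_hyperparameters` is a hypothetical helper that is not part of this library.

```python
import re

MAX_LENGTH = 2500            # limit mentioned in this issue
PATTERN = re.compile(r".*")  # pattern quoted in the validation error


def find_invalid_hyperparameters(params):
    """Return {name: [reasons]} for string values that would likely be rejected."""
    problems = {}
    for name, value in params.items():
        if not isinstance(value, str):
            continue  # only string values are checked in this sketch
        reasons = []
        # Without re.DOTALL, "." stops at "\n", so fullmatch fails on newlines
        if PATTERN.fullmatch(value) is None:
            reasons.append("contains a newline")
        if len(value) > MAX_LENGTH:
            reasons.append(f"longer than {MAX_LENGTH} characters")
        if reasons:
            problems[name] = reasons
    return problems
```

Running a check like this before the run is closed would let the logger report the offending hyperparameter names with a readable reason instead of surfacing the raw ValidationException.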

tsenst commented 7 months ago

Thank you for bringing this up. I will take a look into it.

tsenst commented 7 months ago

@martinber could you please post the complete error message here? A quick fix for your case would be to exclude some of the parameters that save_hyperparameters() stores. How to do this is described here: https://lightning.ai/docs/pytorch/1.6.3/common/hyperparameters.html

There are two options. You can define which parameters should be saved:

self.save_hyperparameters("learning_rate")

Or you can exclude parameters from being saved:

self.save_hyperparameters(ignore=["submodule"])

Meanwhile, I will look into how to deal with very large hyperparameters. I would like to avoid automated pruning, since it can result in unexpected behavior.

martinber commented 7 months ago

Thank you!

This is the error when the hyperparameter is too long and has newlines:

Traceback (most recent call last):
  File "/opt/my_project/src/utils/sagemaker_logger.py", line 164, in log_fun
    self._sagemaker_run.close()
  File "/usr/local/lib/python3.10/dist-packages/sagemaker/experiments/run.py", line 537, in close
    self._trial_component.save()
  File "/usr/local/lib/python3.10/dist-packages/sagemaker/experiments/trial_component.py", line 121, in save
    return self._invoke_api(self._boto_update_method, self._boto_update_members)
  File "/usr/local/lib/python3.10/dist-packages/sagemaker/apiutils/_base_types.py", line 226, in _invoke_api
    api_boto_response = api_method(**api_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/botocore/client.py", line 553, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.10/dist-packages/botocore/client.py", line 1009, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the
UpdateTrialComponent operation: 6 validation errors detected: Value 'HERE THERE IS A
MULTILINE STRING
...
...
...
UNTIL HERE' at 'parameters.patch_module.member.stringValue' failed to satisfy constraint: Member must satisfy regular expression pattern: .*; Value 'HERE THERE IS
ANOTHER MULTILINE STRING
THAT IS VERY LONG
...
...
...
UNTIL HERE' at 'parameters.patch_module.member.stringValue' failed to satisfy constraint: Member must have length less than or equal to 2500

(I cropped the real message because I have lots of hyperparameters giving similar errors.)

Yes, I'm not sure what the best solution is. The real cause is my code, which saves some huge objects as hyperparameters. That doesn't make sense, but I don't really have control over the code (it was logging to Neptune and I have now modified it to log to SageMaker). I could make a PR or patch my own code to save fewer hyperparameters, but I think it would be nice if this library offered some kind of "best effort" mode that truncates long strings and removes newlines.

martinber commented 7 months ago

Hello, I wanted to let you know that I no longer need this feature, so only implement it if you think it is useful for you.

In case you are interested: I suspect that SageMaker Experiments does not have a bright future, and it is very unpolished right now. In any case, thank you, this library was very useful for doing some testing!

tsenst commented 7 months ago

Thanks for the feedback.