Open Songloading opened 8 months ago
The problem seems somewhat related to 43092, but it is not exactly the same.
@Songloading This is similar to https://github.com/ray-project/ray/issues/38522. The problem is that the local staging directory name includes a datestring generated from the current time, and these names collide when two tuners are created one right after the other.
Can you change your code to create/call your 2 tuners consecutively?
tuner1 = Tuner(...)
tuner1.fit()
tuner2 = Tuner(...)
tuner2.fit()
Alternatively, set a unique RunConfig(name) for each tuner, since they seem to correspond to different experiments.
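To illustrate both the collision and the unique-name workaround, here is a minimal sketch. It does not reproduce Ray's internal naming code: `timestamped_dir` and `unique_experiment_name` are hypothetical helpers, and the second-resolution timestamp format is an assumption made only for the example.

```python
import uuid
from datetime import datetime


def timestamped_dir(prefix: str, now: datetime) -> str:
    """Hypothetical stand-in for a directory name built from the current time.

    The second-level resolution is an assumption for illustration only.
    """
    return f"{prefix}_{now.strftime('%Y-%m-%d_%H-%M-%S')}"


def unique_experiment_name(model_name: str) -> str:
    """Hypothetical helper: build an experiment name that is unique per tuner."""
    return f"{model_name}_{uuid.uuid4().hex[:8]}"


# Two tuners created within the same second get the same timestamped name.
now = datetime(2024, 1, 1, 12, 0, 0)
dir_a = timestamped_dir("tuner", now)
dir_b = timestamped_dir("tuner", now)
collides = dir_a == dir_b

# Explicit, distinct names avoid depending on the timestamp at all.
name_a = unique_experiment_name("model_a")
name_b = unique_experiment_name("model_b")

print(dir_a, dir_b, collides)
print(name_a, name_b)
```

Each generated name would then be passed as `RunConfig(name=...)` when constructing the corresponding tuner, so the two experiments never share a directory name even if they are created back to back.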
What happened + What you expected to happen
Hi there, I'm using ray.tune.Tuner to run some tuning jobs on my model. My trainable is just a skeleton, so I can substitute different models inside it depending on the scenario. However, I don't know which model will be used until runtime, so I have to pass in the parameters for all potential models and use a variable to control which model, and which of its parameters, are used. That is the background.

Now, say I have two different models, and I want to run them serially by instantiating two tuners and fitting them. Everything goes well until I try to store the results in different paths using ray.train.RunConfig. For example, in the code snippet, I expect both directories to have 4 different sub-directories containing different results, but the second directory contains the sub-directories that are supposed to be in the first one. I expect the results to be in different directories. Is there anything I am missing here?
Versions / Dependencies
Reproduction script
Directory 1 (results are as expected):
Directory 2 (results are NOT as expected, as it contains results from directory 1):
Issue Severity
High: It blocks me from completing my task.