Closed boneatjp closed 4 months ago
@LittleLittleCloud can you take a look at this?
Do you mean if I could look at the site "@LittleLittleCloud"? If I click the link, it shows "LittleLittleCloud (Xiaoyun Zhang)/ January 2024" and what should I take a look at? Sorry, I don't get the point you mean.
@boneatjp He's asking me to take a look
@boneatjp Can you share the log from MLContext? You can get the log from the context by attaching an event listener:
var context = new MLContext();
context.Log += (o, e) => {
    Console.WriteLine(e);
};
Since it's a Windows Forms application, I changed it as follows:
string logContext = "";
context.Log += (o, e) => {
    logContext += e + Environment.NewLine;
};
and after the RunAsync method finishes,
File.AppendAllText("logContext.txt", logContext);
The file "logContextWv0201.txt" is from running Microsoft.ML.AutoML version 0.20.1, and the file "logContextWv0211.txt" is from running Microsoft.ML.AutoML version 0.21.1.
I'm not sure whether I'm doing this the way you wanted. The log shows many "Microsoft.ML.LoggingEventArgs" entries. logContextWv0201.txt logContextWv0211.txt
@boneatjp Can you print the message instead?
string logContext = "";
context.Log += (o, e) => {
    logContext += e.Message + Environment.NewLine;
};
logContextWv0201.txt logContextWv0211.txt
These are the modified logs.
@boneatjp
Thanks.
It seems that in both log files, the cancellation token is invoked. In the first log file, the trainer for the current running trial is SDCA, while in the second log file, the trainer for the running trial is LightGBM.
In that situation, it's somewhat expected that the cancellation appears "not to work" in the second case. The SDCA trainer is implemented in managed code, so during training it checks the cancellation token periodically and stops training when the token is cancelled. The LightGBM trainer, however, is implemented in native code, so the cancellation token can only be checked once the native code execution has completed. If the native code takes some time to complete, it can look as though the cancellation button doesn't react.
So maybe you can disable the LightGBM trainer, or update the UI to show a "cancelling" status from the moment the cancellation button is clicked until the current experiment is actually cancelled?
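The difference described above can be sketched in plain C#. This is only an illustration under stated assumptions: `ManagedStyle` and `NativeStyle` are hypothetical stand-ins (not ML.NET code), and `Thread.Sleep` simulates the training work. The first loop polls the token between short iterations; the second does its work in one opaque call, so the token is only looked at afterwards.

```csharp
using System;
using System.Threading;

public static class CancellationDemo
{
    // Hypothetical "managed trainer": checks the token between short iterations.
    public static int ManagedStyle(CancellationToken token)
    {
        int completed = 0;
        for (int i = 0; i < 10; i++)
        {
            if (token.IsCancellationRequested) break; // cooperative check
            Thread.Sleep(20);                          // one short iteration
            completed++;
        }
        return completed;
    }

    // Hypothetical "native trainer": nothing observes the token until the
    // blocking call has returned, so all iterations always run.
    public static int NativeStyle(CancellationToken token)
    {
        int completed = 0;
        for (int i = 0; i < 10; i++)
        {
            Thread.Sleep(20); // simulates opaque native work
            completed++;
        }
        // The token can only be inspected here, after the work is done.
        _ = token.IsCancellationRequested;
        return completed;
    }

    public static void Main()
    {
        using var cts = new CancellationTokenSource();
        cts.CancelAfter(50); // request cancellation shortly after starting
        Console.WriteLine($"managed iterations: {ManagedStyle(cts.Token)}");
        Console.WriteLine($"native iterations:  {NativeStyle(cts.Token)}");
    }
}
```

With a token that is cancelled early, the managed-style loop stops before finishing, while the native-style call still runs to completion, which is why a "cancelling" status in the UI can be helpful.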
@LittleLittleCloud
Thank you for checking the logs. However, I guess I did not explain my problem well the first time. With version 0.20.1, the monitor works fine. With version 0.21.1, the monitor does not work properly. While trials are running, Windows controls cannot handle events such as btnCancel_Click. It is like looping without calling Application.DoEvents(). I hope you get the point: with version 0.20.1 the monitor behaves as if Application.DoEvents() were being called, but with version 0.21.1 it does not.
@boneatjp I don't really understand what you mean by the Windows controls not being able to handle btnCancel_Click, because in the logs you shared the cancellation token is invoked in both cases. Are you saying that, after updating to 0.21.1, IMonitor doesn't show completed trials, or that IMonitor doesn't show running trials?
@LittleLittleCloud I think the logs show the run being terminated by the timeout, which I set to 5 minutes, not by btnCancel_Click. I'm saying that after updating to 0.21.1 or 0.21.0, I cannot even click the button, resize the application window, or do anything else with the application.
@boneatjp Are you saying your app deadlocks when you click the training button after updating AutoML?
@LittleLittleCloud I'm not sure whether "deadlocked" is the right way to describe it, but that is how it behaves, so I leave it until it finishes training.
@boneatjp That sounds weird. Could you provide a minimal reproducible example or a link to the code?
@LittleLittleCloud I've made this project with Microsoft.ML.AutoML version 0.20.1.
It should work fine, I guess. But, if you upgrade to Microsoft.ML.AutoML version 0.21.1, you should see what I mean.
In Form1.cs, change the last few lines of button1_Click
from
await experiment.RunAsync(cts.Token);
button1.Enabled = true;
button2.Enabled = false;
richTextBox1.AppendText(Environment.NewLine +"Training Finished!!" + Environment.NewLine);
to
_ = Task.Run(async () => {
await experiment.RunAsync(cts.Token);
button1.Enabled = true;
button2.Enabled = false;
richTextBox1.AppendText(Environment.NewLine +"Training Finished!!" + Environment.NewLine);
});
This behavior was introduced by #6560: RunAsync in SweepablePipelineRunner no longer actually starts the trial in a new task. It simply wraps the trial result in a task object using Task.FromResult.
In your code, this change means that the AutoML experiment blocks and freezes the UI thread. The root cause is not actually in the monitor's code.
The fix is simply to put the AutoML experiment in a new task so it won't block the UI.
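To illustrate why a method that wraps its result with Task.FromResult can still block the caller, here is a minimal console sketch. It is not the AutoML implementation: `DoWork`, `RunWrappedAsync`, and `RunOffloadedAsync` are hypothetical names, and `Thread.Sleep` stands in for a trial. The wrapped version does all its work on the calling thread before any Task exists, whereas Task.Run queues the work to the thread pool.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class BlockingDemo
{
    // Hypothetical stand-in for a trial: returns the id of the thread that ran it.
    private static int DoWork()
    {
        Thread.Sleep(100); // simulated training time
        return Environment.CurrentManagedThreadId;
    }

    // Looks asynchronous, but DoWork() runs synchronously on the caller's
    // thread; the returned task is already completed.
    public static Task<int> RunWrappedAsync() => Task.FromResult(DoWork());

    // Queues the work to the thread pool, so the caller's thread stays free.
    public static Task<int> RunOffloadedAsync() => Task.Run(() => DoWork());

    public static async Task Main()
    {
        int caller = Environment.CurrentManagedThreadId;
        int wrapped = await RunWrappedAsync();
        int offloaded = await RunOffloadedAsync();
        Console.WriteLine($"caller={caller}, wrapped={wrapped}, offloaded={offloaded}");
    }
}
```

In a UI application the "caller" is the UI thread, so awaiting the wrapped version keeps that thread busy for the whole run, which matches the frozen window described in this thread.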
@LittleLittleCloud OK, I've modified my source code as you suggested. When I tried to run it, I got a 'System.InvalidOperationException' in the IMonitor, at the point where it outputs logs to the RichTextBox.
Since it worked fine with version 0.20.1, there must have been changes with version 0.21.1 to use IMonitor. Are you saying that I've got change my source code to use with version 0.21.1?
I really appreciate your support in showing how I can work around this with version 0.21.1. I guess I have to learn more about writing C# code. I've read a little about accessing controls from other tasks, but I have not yet understood how to do it without getting errors.
@boneatjp When you talk about source code, do you mean the code you shared in the zip file above, or the actual code in your project?
Since it worked fine with version 0.20.1, there must have been changes with version 0.21.1 to use IMonitor. Are you saying that I've got change my source code to use with version 0.21.1?
Yes. After the change above, I can actually run the project you shared above with 0.21.1. So if the source code you mean is the zip file you shared above, that confuses me.
@LittleLittleCloud Well, I'm talking about the project I uploaded here, changed as you mentioned, and I'm getting an error. Are you saying you don't get any errors at all after modifying the code as you describe here?
@boneatjp OK, maybe I forgot to mention some other changes I made. I've pushed the entire project with my changes to GitHub. Maybe that will help.
@LittleLittleCloud Thank you so much! I've fixed my app to run with version 0.21.1.
The project you pushed to GitHub still had the same error I was getting, which was caused by accessing controls from other tasks, but I've managed to fix it by using Control.Invoke in the IMonitor class.
There is quite a difference between version 0.20.1 and version 0.21.1, I think. But I'm so glad I can use version 0.21.1 with your help.
Cool, glad you figured it out! Please let us know if you need any further help.
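The Control.Invoke approach mentioned above might look roughly like the following helper. This is a sketch, not the code from the attached project: `UiLog` and `Append` are hypothetical names, and it assumes a Windows Forms project where the monitor writes to a RichTextBox. It uses `Control.InvokeRequired` to detect a call from a non-UI thread and `Control.Invoke` to marshal the update to the thread that owns the control.

```csharp
using System;
using System.Windows.Forms;

public static class UiLog
{
    // Appends a line of text to a RichTextBox from any thread.
    public static void Append(RichTextBox box, string text)
    {
        if (box.InvokeRequired)
        {
            // Called from a background task: run the update on the UI thread
            // that owns the control, avoiding InvalidOperationException.
            box.Invoke(new Action(() => box.AppendText(text + Environment.NewLine)));
        }
        else
        {
            // Already on the UI thread: update the control directly.
            box.AppendText(text + Environment.NewLine);
        }
    }
}
```

A monitor callback could then call something like `UiLog.Append(richTextBox1, message)` instead of touching the control directly. The same pattern would apply to the button state updates that run inside Task.Run.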
System Information (please complete the following information):
Describe the bug
Writing a Windows Forms application using ML.NET 3.0.1:
NuGet
Microsoft.ML Version 3.0.1
Microsoft.ML.AutoML Version 0.20.1
Microsoft.ML.CpuMath Version 3.0.1
Microsoft.ML.DataView Version 3.0.1
Microsoft.ML.FastTree Version 3.0.1
Microsoft.ML.LightGbm Version 3.0.1
Microsoft.ML.Mkl.Components Version 3.0.1
Microsoft.ML.Mkl.Redist Version 3.0.1
To Reproduce
After installing Microsoft.ML.AutoML Version 0.21.0 or 0.21.1, my Monitor class does not behave the way it does with Microsoft.ML.AutoML Version 0.20.1.
Expected behavior
Clicking btnCancel stops the experiment. This works fine with version 0.20.1, but does not work with version 0.21.0 or 0.21.1.