Closed mlaboss closed 1 year ago
I've been getting exactly the same error since last week, but at this point I can't train at all, even if I create a new project. Creating a new project used to be the solution for me (after removing data), if you want to try that.
You can also check the .mbconfig file in your project folder (default: MLModel1.mbconfig). In the projects that worked before, it looked like this:
{
  "TrainingTime": 2147482,
  "Scenario": "ImageClassification",
  "DataSource": {
    "Type": "Folder",
    "Version": 1,
    "FolderPath": "C:\\Users\\areal\\source\\repos\\CLAIM"
  },
  "Environment": {
    "Type": "LocalGPU",
    "Version": 1
  },
  "RunHistory": {
    "Version": 1,
    "Type": "Result",
    "Trials": [
      {
        "Version": 0,
        "Type": "Trial",
        "TrainerName": "DNN + ResNet50",
        "Score": 0.99864406779661019,
        "RuntimeInSeconds": 448.02398681640625
      }
    ],
    "Pipeline": {
      "parameter": {
        "0": {
          "OutputColumnName": "Label",
          "InputColumnName": "Label"
        },
        "1": {
          "LabelColumnName": "Label",
          "ScoreColumnName": "Score",
          "FeatureColumnName": "ImageSource"
        },
        "2": {
          "OutputColumnName": "PredictedLabel",
          "InputColumnName": "PredictedLabel"
        }
      },
      "estimators": [
        "MapValueToKey",
        "ImageClassificationMulti",
        "MapKeyToValue"
      ]
    },
    "MetricName": "MicroAccuracy"
  },
  "Type": "TrainingConfig",
  "Version": 2
}
But now I'm always getting the following, no matter what I do (even in new projects):
{
  "Type": "TrainingConfig",
  "Version": 2
}
P.S.: I'm using 2 categories, each with 40K .png images.
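For anyone who wants to check quickly whether their own .mbconfig is in the stripped-down state, here is a minimal Python sketch. It assumes the file is plain JSON and only compares against the top-level keys that appear in the working config quoted above; that key list is taken from this one example and is not an official schema. The function names are made up for illustration.

```python
import json
from pathlib import Path

# Top-level keys seen in the working config quoted in this thread.
# Assumption: this is just what that one example contained, not a schema.
EXPECTED_KEYS = ("TrainingTime", "Scenario", "DataSource", "Environment", "RunHistory")


def missing_sections(config):
    """Return the expected top-level keys that are absent from config."""
    return [key for key in EXPECTED_KEYS if key not in config]


def check_mbconfig(path):
    """Load an .mbconfig file (treated as plain JSON) and list missing sections."""
    # utf-8-sig tolerates a byte-order mark if the file was saved with one.
    text = Path(path).read_text(encoding="utf-8-sig")
    return missing_sections(json.loads(text))


if __name__ == "__main__":
    # The stripped-down config reported above is missing every section.
    stripped = {"Type": "TrainingConfig", "Version": 2}
    print(missing_sections(stripped))
```

To check a real project, call check_mbconfig("MLModel1.mbconfig") from the project folder. An empty list only means the sections are present, not that training will succeed.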
@JakeRadMSFT Can you take a look at this?
@mlaboss - if you just re-select the folder, or select a different folder and then re-select the original folder, does that resolve the issue?
Another potential workaround:
Does closing Model Builder and re-opening it resolve this issue?
I'm going to try these workarounds and test out a potential fix.
We should try to find a better approach for handling folder/file changes.
Related to: #2081
Selecting a different folder then re-selecting the original folder: Does not resolve the issue, still get the error. I did get a "Changing the folder path will reset training progress, and you will have to re-start training. Would you like to continue?" which I clicked "Yes" on.
Closing Model Builder and re-opening: Does not resolve the issue either.
Also hitting this issue. I've been battling it for 5 or 6 hours now and I can't find a way to work around it. I've cleared temp files (both user and system), reinstalled VS and the extensions, switched between VS2019 and VS2022, rebooted, deleted the models and started over, moved the directory, tried rolling back extension versions, etc. and no matter what I do it appears to be stuck with this exact same exception every time I try to train a model. This is a sev0 blocker IMO.
@gsuberland Did you do the exact same thing as the other user? Removed some files after training? Or are you just getting a similar message?
@beccamc The problem arose after I moved the training data to a different directory. The displayed error and stack trace are identical to what was reported with this issue, but I now suspect that it's a different underlying bug. I've opened up a new issue at #2102 in relation.
The error message is definitely a UX bug, since it doesn't actually tell you what the problem was beyond "something went wrong in the training process", and that isn't immediately clear from the exception message or the stack trace.
Same as @gsuberland. The problem arises every time I edit the training data. It can be fixed by renaming the training data folder and reselecting it on the "Data" page.
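If you hit this often, the rename step can be scripted. This is only a sketch of the file system side of the workaround: the function name and the example folder names are hypothetical, and you still have to reselect the renamed folder on the "Data" page yourself.

```python
from pathlib import Path


def rename_data_folder(folder, new_name):
    """Rename a training data folder in place and return the new path."""
    src = Path(folder)
    if not src.is_dir():
        raise NotADirectoryError(f"{src} is not a directory")
    # Keep the same parent directory; only the last path component changes.
    dst = src.with_name(new_name)
    if dst.exists():
        raise FileExistsError(f"{dst} already exists")
    src.rename(dst)
    return dst


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(rename_data_folder("TrainingImages", "TrainingImages_v2"))
```

Close Model Builder before renaming so that nothing holds the folder open.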
@JakeRadMSFT to take a look and get into May release.
Action plan:
I have confirmed that Jake's fix of the reload button works. This will ship in the Fall Release.
Model Builder Version: 16.13.1.2210302
Visual Studio Version: 2022 (17.1.0)
Bug description: After removing some training images from an already-trained image classification model, attempting to re-train the model results in a "filePath cannot be null or empty" error.
Steps to Reproduce
Expected Experience: Re-training the model works as normal.
Actual Experience: A Model Builder Error dialog pops up.
Log file attached: LiquidIn24WellPlateModel-OKTSP1.txt
Additional Context: It seems like the model builder is looking for the removed file and having a problem when it can't find it.
I am able to work around the problem by creating a brand new folder with a different name than the old folder, putting the training images in there, and pointing the model builder at that folder instead.
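The new-folder workaround can be sketched in Python as follows. It assumes the training images sit in one subfolder per category; the function name and the example paths are hypothetical. The script copies the whole tree into a folder that must not exist yet and returns a per-category file count, so you can confirm nothing was lost before pointing Model Builder at the new folder.

```python
import shutil
from pathlib import Path


def copy_to_fresh_folder(src, dst):
    """Copy the training tree to a brand new folder and count files per category."""
    src, dst = Path(src), Path(dst)
    if dst.exists():
        # The workaround relies on a folder name that has not been used before.
        raise FileExistsError(f"{dst} already exists; pick a new name")
    shutil.copytree(src, dst)
    # Assumption: each immediate subfolder of dst is one image category.
    return {
        sub.name: sum(1 for f in sub.iterdir() if f.is_file())
        for sub in sorted(dst.iterdir())
        if sub.is_dir()
    }


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(copy_to_fresh_folder("TrainingImages", "TrainingImages_new"))
```

Copying rather than moving leaves the original folder untouched, which makes it easy to compare the two if the counts look wrong.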