dotnet / machinelearning-modelbuilder

Simple UI tool to build custom machine learning models.
Creative Commons Attribution 4.0 International

Visual Studio 2022 Preview (Image classification does not work with local GPU) #1751

Closed enjoy8bit closed 2 years ago

enjoy8bit commented 3 years ago

Model Builder (Preview) Version: 16.6.2.2132901
Model Builder GPU Support Version (Preview): 16.6.3.2136603
Visual Studio Version: 2022 Version 17.0.0 Preview 3.1

Bug description: I can't run any local GPU image classification training, because the compatibility check always fails.

image

It says the GPU extension is missing. When I try to install it, it says it is already installed. Can you help me with this, or is this simply not working on VS 2022?

LittleLittleCloud commented 3 years ago

The GPU check fails because the version of the GPU extension you have is NEWER than the version of the MB extension. You can upgrade your MB extension to 16.6.3 to fix this problem.
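To make the mismatch concrete, here is a minimal sketch that compares the two version strings reported at the top of this issue. It is not Model Builder's actual compatibility check. The helper names `parse_version` and `releases_match` are hypothetical, and comparing only the first three components is an assumption based on the advice to upgrade to "16.6.3".

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Split a dotted version string such as '16.6.3.2136603' into integers."""
    return tuple(int(part) for part in text.strip().split("."))


def releases_match(model_builder: str, gpu_support: str, parts: int = 3) -> bool:
    """Return True when the first `parts` components of both versions agree.

    Comparing three components is an assumption for illustration only.
    """
    return parse_version(model_builder)[:parts] == parse_version(gpu_support)[:parts]


if __name__ == "__main__":
    mb = "16.6.2.2132901"   # Model Builder version reported in this issue
    gpu = "16.6.3.2136603"  # GPU Support version reported in this issue
    if releases_match(mb, gpu):
        print("Versions match")
    else:
        print(f"Mismatch: Model Builder {mb} vs GPU Support {gpu}")
```

With the versions from this issue the sketch reports a mismatch, because 16.6.2 and 16.6.3 differ in the third component.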

enjoy8bit commented 3 years ago

OK, but it says all my stuff is up to date: image

image

Any ideas?

beccamc commented 3 years ago

You can download the newest build from the marketplace https://marketplace.visualstudio.com/items?itemName=MLNET.07

LittleLittleCloud commented 3 years ago

@beccamc

The binary in that link is for VS2019. I don't think we have released Model Builder for VS2022 to the public yet, and that's why @enjoy8bit can't upgrade it through the extension manager. Do we still have the channel for 16.6.3?

enjoy8bit commented 3 years ago

image

Yup, I installed that specific VS19 version. But in VS22 I still have the issue.

beccamc commented 3 years ago

I don't know if that channel is still there. @enjoy8bit We are shipping an official build for VS 2022 in the next two weeks.

@LittleLittleCloud Will we update the Marketplace build before the release on the 15th?

LittleLittleCloud commented 3 years ago

@beccamc I don't think so, but maybe we could share v16.7.3 (the one inserted in VS2022 preview4) to unblock @enjoy8bit? It should contain most of the fixes for the next release

enjoy8bit commented 3 years ago

Hello, any updates?

beccamc commented 3 years ago

You can download Preview 4 here. An updated Model Builder version should be built in. https://devblogs.microsoft.com/visualstudio/visual-studio-2022-preview-4-is-now-available/

enjoy8bit commented 3 years ago

I updated to preview 4. I still get this message: image

So where do I get the correct version of ML.NET Model Builder GPU Support for VS2022? Many thanks in advance.

LittleLittleCloud commented 3 years ago

After updating your VS 2022 to Preview 4, the version of your Model Builder extension should be 16.7.3.2143002. So you can just update your GPU extension (which was 16.6.3.2136603 according to the previous conversation) to the recently updated one (16.7.3.2143002) in the Marketplace, and everything should work.

LittleLittleCloud commented 3 years ago

Just found out that the GPU extension on the Marketplace is for VS2019 only, sorry for the inconvenience.

@beccamc Did we release the GPU extension for VS2022? If it's not released yet, maybe we can share a download link to an unpublished build to unblock @enjoy8bit?

enjoy8bit commented 3 years ago

I am starting to see a pattern, or a déjà vu, in here :) Anyway, thank you for your support. Hoping for a release.

pierre01 commented 2 years ago

Sucky support for ML.NET. Why can't it at least be backward compatible, for a technology that isn't even 2 years old?

beccamc commented 2 years ago

Hi everyone - update: the 2022 builds are out.

https://marketplace.visualstudio.com/items?itemName=MLNET.ModelBuilder2022
https://marketplace.visualstudio.com/items?itemName=MLNET.ModelBuilderGPU2022

enjoy8bit commented 2 years ago

Hello, thank you for those updates. Unfortunately, this still does not seem to solve the original issue.

Visual Studio Version: image

Model Builder Version: image

Model Builder GPU Support Version: image

I have updated the extensions and also Visual Studio to the latest versions, but when I want to set up an image classification with local GPU support, it still says that the GPU extension is missing when I click the "Check compatibility" button: image

Do you have any ideas how we can solve this issue?

Many thanks in advance

pierre01 commented 2 years ago

Actually, I received an update yesterday from Becca McHenry (see below), and I was able to train my model. It ran and created a zip file, but showed an error message at the end of the first run. So I see progress. Thanks!! Pierre

from Becca McHenry

Hi everyone - update 2022 builds are out.

https://marketplace.visualstudio.com/items?itemName=MLNET.ModelBuilder2022 https://marketplace.visualstudio.com/items?itemName=MLNET.ModelBuilderGPU2022

beccamc commented 2 years ago

@enjoy8bit Your Model Builder version and GPU version need to match. You need to update your Model Builder version to 16.7.6.2150501. That version should be available for download here - https://marketplace.visualstudio.com/items?itemName=MLNET.ModelBuilder2022

image

Are you having any problems installing it?

enjoy8bit commented 2 years ago

Eureka. Thank you very much. It is working now :)

enjoy8bit commented 2 years ago

I can run one iteration, and at the end it fails with the following information: image

image

Maybe you can have a look at this? Many thanks in the meantime :) Maybe this is the same error that Pierre mentioned.

I also noticed that my GPU utilization peaked at about 15-20%. Is there a limit within the ML.NET application?

pierre01 commented 2 years ago

I had the exact same issue!

LittleLittleCloud commented 2 years ago

@pierre01 @enjoy8bit

Looks like a re-entrance bug... Could you try creating a new Model Builder project for image classification and see if the error goes away? Alternatively, you could open your .mbconfig file as JSON and add a FolderPath property, containing the full path to your image folder, to the folder DataSource:

  "TrainingTime": 2147482,
  "Scenario": "ImageClassification",
  "DataSource": {
    "Type": "Folder",
    "Version": 1,
    "FolderPath": "add/your/full/path/to/folder/here"
  },

@zewditu Could you take a look at this? The exception is thrown when sending telemetry, which shouldn't happen. Telemetry should never throw an exception, and FolderPath should never be missing in DataSource...
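If editing the file by hand is inconvenient, the workaround above could be scripted roughly as follows. This is only a sketch: the helper names `add_folder_path` and `patch_mbconfig` are hypothetical, and it assumes that `DataSource` is a top-level key of the .mbconfig JSON, as the fragment above suggests. Back up the file before trying it.

```python
import json
from pathlib import Path


def add_folder_path(config: dict, image_folder: str) -> dict:
    """Add FolderPath to a folder-type DataSource if it is missing.

    Assumes DataSource is a top-level key, as in the fragment quoted above.
    An existing FolderPath is left untouched.
    """
    data_source = config.get("DataSource")
    if isinstance(data_source, dict) and data_source.get("Type") == "Folder":
        if not data_source.get("FolderPath"):
            data_source["FolderPath"] = image_folder
    return config


def patch_mbconfig(mbconfig_file: str, image_folder: str) -> None:
    """Read an .mbconfig file, patch it in memory, and write it back."""
    path = Path(mbconfig_file)
    # utf-8-sig tolerates a byte order mark if the file was saved with one.
    config = json.loads(path.read_text(encoding="utf-8-sig"))
    add_folder_path(config, image_folder)
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
```

A call such as `patch_mbconfig("MyModel.mbconfig", "C:/data/images")` would apply it; both arguments here are placeholder values.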

LittleLittleCloud commented 2 years ago

@enjoy8bit

Image classification uses the default batch size (10 or 20, I think), so if you have a decent GPU it's likely that it won't be running at 100%.

Meanwhile, it's also possible that you are still training on CPU. You can check GPU memory usage to confirm whether that is the case. (If GPU memory usage doesn't spike during training, it's likely that you are still training on CPU.)
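One way to check this outside Task Manager, assuming an NVIDIA GPU with the `nvidia-smi` tool on the PATH, is to sample memory usage before and during training. The sketch below is illustrative only, and the function names are hypothetical. It reports values in MiB and assumes the driver returns plain numbers rather than "[N/A]".

```python
import subprocess

# nvidia-smi query that prints one "used, total" line per GPU, in MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(output: str) -> list[tuple[int, int]]:
    """Turn lines like '7168, 24576' into (used_mib, total_mib) pairs."""
    rows = []
    for line in output.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def read_gpu_memory() -> list[tuple[int, int]]:
    """Run nvidia-smi once and return memory usage for every GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_csv(result.stdout)


if __name__ == "__main__":
    for index, (used, total) in enumerate(read_gpu_memory()):
        print(f"GPU {index}: {used} of {total} MiB in use")
```

Running it once while idle and again during training shows whether memory usage actually rises when training starts.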

enjoy8bit commented 2 years ago

> @pierre01 @enjoy8bit
>
> Looks like a re-entrance bug... Could you try creating a new Model Builder project for image classification and see if the error goes away? Alternatively, you could open your .mbconfig file as JSON and add a FolderPath property, containing the full path to your image folder, to the folder DataSource:
>
>   "TrainingTime": 2147482,
>   "Scenario": "ImageClassification",
>   "DataSource": {
>     "Type": "Folder",
>     "Version": 1,
>     "FolderPath": "add/your/full/path/to/folder/here"
>   },
>
> @zewditu Could you take a look at this? The exception is thrown when sending telemetry, which shouldn't happen. Telemetry should never throw an exception, and FolderPath should never be missing in DataSource...

image

The full path seems to already be used here.

enjoy8bit commented 2 years ago

> @enjoy8bit
>
> Image classification uses the default batch size (10 or 20, I think), so if you have a decent GPU it's likely that it won't be running at 100%.
>
> Meanwhile, it's also possible that you are still training on CPU. You can check GPU memory usage to confirm whether that is the case. (If GPU memory usage doesn't spike during training, it's likely that you are still training on CPU.)

CPU Training: CPU_Training

GPU Training: GPU_Training

When I train on CPU, I can tell that it uses quite a lot of CPU resources. But on GPU, it seems GPU memory is not being used. GPU utilization is low, and it still shows about 7 of 24 GB of GPU RAM in use.

LittleLittleCloud commented 2 years ago

@enjoy8bit

It still uses 7 GB of GPU memory, which is similar to the GPU memory usage on my PC. So it might be because your GPU is quite decent and the default batch size is too small for your RTX 2080.

enjoy8bit commented 2 years ago

> @enjoy8bit
>
> It still uses 7 GB of GPU memory, which is similar to the GPU memory usage on my PC. So it might be because your GPU is quite decent and the default batch size is too small for your RTX 2080.

In the following screenshot, I am not running a training: image

And it still shows about 7 GB of GPU RAM in use. So for me it currently looks like training does not use GPU memory at all. I lack experience with this topic. All I can do is provide you with data that might lead to some fixes.

Of course, I am not ruling out that the problem is on my side and I just don't understand it yet.

LittleLittleCloud commented 2 years ago

It might be that the training process was not killed. Can you kill the training process and see if the GPU memory is released? The training process should have the name "ServiceHub.Host.Clr.x64".

image

enjoy8bit commented 2 years ago

> It might be that the training process was not killed. Can you kill the training process and see if the GPU memory is released? The training process should have the name "ServiceHub.Host.Clr.x64".

image

You were right: image

I closed Visual Studio, and the resources were released.

beccamc commented 2 years ago

Is everyone on this thread now able to train?

pierre01 commented 2 years ago

No: mlnetErr01

And when I try to reinstall ML.NET Model Builder GPU Support 2022, I get: MLNerErr02

beccamc commented 2 years ago

Sorry @pierre01 I can't see the image or error you included after "I get:"

pierre01 commented 2 years ago

> Sorry @pierre01 I can't see the image or error you included after "I get:"

I modified the images in my comment above.

You can ignore my comment, as the latest version has fixed this issue.

beccamc commented 2 years ago

Hi all, closing this as I believe the issue is resolved. If anyone encounters this, make sure that the GPU and Model Builder extension versions match.

If there are any other GPU issues please re-open or file a new issue. Thanks!