dotnet / TorchSharp

A .NET library that provides access to the library that powers PyTorch.
MIT License

Loading pre-trained models (`torch.hub.load_state_dict_from_url()` or `torch.hub.load()`) #713

Open kaiidams opened 2 years ago

kaiidams commented 2 years ago

@NiklasGustafsson @ericstj @GeorgeS2019

People probably want a place to download pre-trained model files so that they can fine-tune pre-made models from TorchVision and TorchAudio without converting the files themselves. We need the likes of `torch.hub` and a public storage location.
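On the PyTorch side, this is roughly what `torch.hub.load_state_dict_from_url()` does: download once into a local cache directory, then reuse the cached file on later calls. A minimal framework-free sketch of that cache-then-download pattern (the function name and the injectable `download` callable are illustrative, not part of any real API):

```python
import os

def fetch_weights(url, cache_dir, download):
    """Download `url` into `cache_dir` unless a cached copy already exists.

    `download` is any callable taking (url, destination_path); injecting it
    keeps this sketch testable without network access.
    """
    os.makedirs(cache_dir, exist_ok=True)
    # Derive the cache file name from the last URL path segment.
    filename = url.rstrip("/").rsplit("/", 1)[-1]
    destination = os.path.join(cache_dir, filename)
    if not os.path.exists(destination):
        download(url, destination)
    return destination
```

A TorchSharp equivalent would sit on top of `torch.hub.download_url_to_file`, which already exists in the library.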

NiklasGustafsson commented 2 years ago

Adding @luisquintanilla to the discussion -- this is something we need to address for ML.NET in general.

yueyinqiu commented 6 months ago

I'm wondering why progress seems to have stalled here. Has this been solved, or are there perhaps some commercial obstacles?

Would it be acceptable to upload the converted models to a platform like Hugging Face in the name of an individual?

NiklasGustafsson commented 6 months ago

The reason is simple -- in the general case, you would want to have the ability to associate model weights with code, such as a NuGet package. Model weights are really like any other binary resource (to use the .NET name), just really big. That size is the problem:

  1. NuGet Gallery won't accept packages larger than 250MB, so that excludes a lot of weights files. The weights can't just be embedded in the package, they have to reside outside.

  2. We can provide a "standard" space for model weights, but someone will have to pay for it. That makes it hard.

  3. We can make it the developer's problem -- find some location to store weights, paid or free. We already have this capability, and @shaltielshmid's TorchSharp extension library has expanded the model weight formats that can be supported.

shaltielshmid commented 6 months ago

One option that might be possible: if we are using PyBridge to load the weights, maybe we can just link directly to the storage that PyTorch uses, and then we don't have to worry about that.

@NiklasGustafsson What do you think? Is this possible?

I will note that I haven't researched how this works behind the scenes, so my answer is completely theoretical.

GeorgeS2019 commented 6 months ago

@NiklasGustafsson https://github.com/dotnet/TorchSharp/discussions/1277#discussion-6438105

Use huggingface to store

NiklasGustafsson commented 6 months ago

> One option that might be possible: if we are using PyBridge to load the weights, maybe we can just link directly to the storage that PyTorch uses, and then we don't have to worry about that.

The challenge is what @yueyinqiu called 'a public storage' -- who pays? We should make it easy to download weights in a standardized manner, but (for now), we have to make it up to package (model) developers to decide where to store the weights publicly.

shaltielshmid commented 6 months ago

Right. I guess my response was regarding the linked issue about downloading the weights of a pretrained model (vgg) from the torch hub. Maybe we can integrate using the already-uploaded weights from the hub?

NiklasGustafsson commented 6 months ago

You mean like we do in torchvision.models but automatically finding the weights files?

shaltielshmid commented 6 months ago

Yes

I'm not sure if that's the place to implement it, since it would require PyBridge to be installed, which would be another dependency.

NiklasGustafsson commented 6 months ago

No, it would go elsewhere.

GeorgeS2019 commented 6 months ago

@NiklasGustafsson @shaltielshmid

Could one of you comment on the feasibility of using Hugging Face?

https://huggingface.co/williamlzw/stable-diffusion-1-5-torchsharp

Ref: https://github.com/dotnet/TorchSharp/discussions/1277#discussion-6438105

GeorgeS2019 commented 6 months ago

Hugging Face provides an interface that allows you to export Transformers models to TorchScript, making it possible to reuse these models in environments other than PyTorch-based Python programs.

Now that TorchSharp supports TorchScript: is Hugging Face the right place for TorchSharp to host TorchScript models?

GeorgeS2019 commented 6 months ago

@williamlzw

Could you share your experience of distributing a TorchSharp model through Hugging Face?

NiklasGustafsson commented 6 months ago

Hugging Face is absolutely feasible, but TorchSharp should not standardize on HF as the one place to store weights. It should be up to model developers to decide where to store them.


yueyinqiu commented 6 months ago

Perhaps I have misunderstood this issue. Actually, letting developers store and provide the weights themselves is what PyTorch does, and that also makes sense for us, because anyone can create a model and we can't know about all of them.

Meanwhile, since we cannot resolve `hubconf.py` without a Python environment, implementing `torch.hub` in TorchSharp would require a completely new protocol, just like our `.dat` format, which is almost impossible and also not really necessary. It still couldn't load PyTorch models and might even be misleading. So perhaps nothing more should be added to our `torch.hub`; I suppose it will be fine to just use `download_url_to_file` and then `someModel.load`.

I only commented on this issue because of #1265, which asks for a pretrained model provided by PyTorch. For other models, perhaps we should let developers decide whether to provide their model in `.dat` format, but I suppose PyTorch will never do that itself. So if we want to stay as consistent with it as possible, we should provide converted versions of all the pretrained models from PyTorch.

I can understand that using Hugging Face as the "standard" space is not entirely proper; at the very least, we shouldn't hard-code a Hugging Face link inside TorchSharp. So perhaps the best solution for now is to upload the converted models to Hugging Face in the name of an individual rather than TorchSharp, and create another project that provides a simplified way to load those models.
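As a sketch of what such a separate loader project could look like: a plain registry mapping model names to uploaded files, kept outside TorchSharp itself. The `vgg16` URL below is the one actually used later in this thread; the registry layout and function names are made up for illustration:

```python
# Hypothetical registry for a standalone loader package. Only the vgg16
# entry corresponds to a file actually mentioned in this thread.
HUB = "https://huggingface.co/yueyinqiu/vision-TorchSharp/resolve/main"

WEIGHTS = {
    "vgg16": f"{HUB}/VGG16_Weights.IMAGENET1K_V1",
}

def resolve(model_name):
    """Return the download URL for a known converted model."""
    try:
        return WEIGHTS[model_name]
    except KeyError:
        raise ValueError(f"no converted weights registered for {model_name!r}")
```

Keeping the registry in a separate package means the links can be updated or moved off Hugging Face without touching TorchSharp.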

yueyinqiu commented 6 months ago

Meanwhile, I failed to load the vgg16 model with the current PyBridge. It seems that it is stored in an "old PyTorch format".


And if we want to do that with PyBridge, this excerpt from PyTorch's `hub.py` should probably also be noted:

```python
# hub.py (excerpt, with the imports it relies on)
import os
import warnings
import zipfile
from typing import Any, Dict

import torch
# MAP_LOCATION is a type alias defined earlier in torch/hub.py.

# Hub used to support automatically extracting from zipfiles manually
# compressed by users. The legacy zip format expects only one file from
# torch.save() < 1.6 in the zip.
# We should remove this support since zipfile is now the default format
# for torch.save().
def _is_legacy_zip_format(filename: str) -> bool:
    if zipfile.is_zipfile(filename):
        infolist = zipfile.ZipFile(filename).infolist()
        return len(infolist) == 1 and not infolist[0].is_dir()
    return False

def _legacy_zip_load(filename: str, model_dir: str, map_location: MAP_LOCATION, weights_only: bool) -> Dict[str, Any]:
    warnings.warn('Falling back to the old format < 1.6. This support will be '
                  'deprecated in favor of default zipfile format introduced in 1.6. '
                  'Please redo torch.save() to save it in the new zipfile format.')
    # Note: extractall() defaults to overwriting a file if it exists. No need to clean up beforehand.
    #       We deliberately don't handle tarfile here since our legacy serialization format was in tar.
    #       E.g. resnet18-5c106cde.pth which is widely used.
    with zipfile.ZipFile(filename) as f:
        members = f.infolist()
        if len(members) != 1:
            raise RuntimeError('Only one file (not a dir) is allowed in the zipfile')
        f.extractall(model_dir)
        extracted_name = members[0].filename
        extracted_file = os.path.join(model_dir, extracted_name)
    return torch.load(extracted_file, map_location=map_location, weights_only=weights_only)
```
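For reference, the detection half of that helper can be exercised without torch at all. This stand-alone re-implementation (renamed `is_legacy_zip_format` to make clear it is not the PyTorch-internal function) keeps only the `zipfile` logic:

```python
import zipfile

def is_legacy_zip_format(filename: str) -> bool:
    """True if `filename` is a zip archive containing exactly one
    non-directory member, i.e. the pre-1.6 manually compressed layout."""
    if zipfile.is_zipfile(filename):
        infolist = zipfile.ZipFile(filename).infolist()
        return len(infolist) == 1 and not infolist[0].is_dir()
    return False
```

A TorchSharp-side downloader could run the same check to decide whether a fetched file needs extraction before loading.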
GeorgeS2019 commented 6 months ago

@yueyinqiu

Thanks for helping us check what works and what fails.

williamlzw commented 6 months ago

I ported the TorchSharp version of the SD model by referring to the code in this repository: https://github.com/kjsman/stable-diffusion-pytorch

yueyinqiu commented 6 months ago

I have uploaded all the weights in torchvision.models here with this.

I haven't checked all models, but at least the first element of the first parameter of vgg16 loads correctly:

```csharp
using TorchSharp;

var huggingFace = "https://huggingface.co/";

var file = new FileInfo("./vgg16.dat");
if (!file.Exists)
    torch.hub.download_url_to_file(
        $"{huggingFace}yueyinqiu/vision-TorchSharp/resolve/main/VGG16_Weights.IMAGENET1K_V1",
        file.FullName);

var vgg1 = torchvision.models.vgg16();
var vgg2 = torchvision.models.vgg16(weights_file: file.FullName);

Console.WriteLine(vgg1.parameters().First().data<float>().First()); // seems to be a random value
Console.WriteLine(vgg2.parameters().First().data<float>().First()); // always -0.5537306
```
williamlzw commented 6 months ago

@yueyinqiu You need to ensure that the key names of the TorchSharp weights exactly match the PyTorch key names, and that the shapes and types of the weight values match as well.

To make verifying the weights easier, you can extract sub-models of a model and verify them separately, such as the backbone, the head, or even a single layer.

TorchSharp:

```csharp
var model = torchvision.models.vgg16();
foreach (var index in model.state_dict())
{
    Console.WriteLine($"{index.Key},{string.Join(",", index.Value.shape)}");
}
```

PyTorch:

```python
import torchvision

model = torchvision.models.vgg16()
for (key, value) in model.state_dict().items():
    print(key, value.shape)
```
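The comparison described above can be automated: dump a `key -> shape` mapping from each side and diff the two. A framework-free sketch (the dictionaries stand in for the real state dicts; the function name is made up):

```python
def diff_state_dicts(left, right):
    """Compare two {key: shape} mappings and report mismatches.

    Returns (missing_in_right, extra_in_right, shape_mismatches).
    """
    missing = sorted(k for k in left if k not in right)
    extra = sorted(k for k in right if k not in left)
    mismatched = sorted(
        (k, left[k], right[k])
        for k in left
        if k in right and left[k] != right[k]
    )
    return missing, extra, mismatched
```

An empty result on all three lists means the TorchSharp model can accept the converted weights key-for-key.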

yueyinqiu commented 6 months ago

@williamlzw I suppose they are. Did you run into any problems when loading it?

williamlzw commented 6 months ago

How did you convert these weights? For example, VGG16_Weights.IMAGENET1K_V1.

yueyinqiu commented 6 months ago

I just use `weight.get_state_dict` (where `weight` is something like `torchvision.models.VGG16_Weights.IMAGENET1K_V1`) to download the PyTorch state dictionary, and then use `exportsd.save_state_dict` to save it in TorchSharp format. The full code is here.

And it was @kaiidams and @NiklasGustafsson who wrote the models, keeping the key names the same as the ones in PyTorch. #703 #595 #759. (For vgg, it was first created in this commit.)

williamlzw commented 6 months ago

I tried it and it can be loaded.

C#:

```csharp
public static void test_tensor()
{
    var vgg1 = torchvision.models.vgg19();
    // foreach (var key in vgg1.state_dict().Keys)
    // {
    //     Console.WriteLine(key);
    // }
    vgg1.load("C:\\Users\\Administrator\\Desktop\\vgg19.dat");
}
```

PyTorch:

```python
import exportsd
from torchvision.models import vgg19

model = vgg19()
# for (k, v) in model.state_dict().items():
#     print(k)

with open("./vgg19.dat", "wb") as file:
    exportsd.save_state_dict(model.state_dict(), file)
```
williamlzw commented 6 months ago

There is no problem with vgg19, but there is indeed a problem with loading vgg16. @NiklasGustafsson

yueyinqiu commented 6 months ago

@williamlzw How can I tell whether it was loaded correctly? I have checked the first parameter and it seems to be loaded successfully. Sorry, this might be a silly question, but I really don't know how to verify it.

williamlzw commented 6 months ago

One way is to rewrite the sequential layers of the PyTorch model, split it into individual layers, and then convert each one into TorchSharp weights separately.
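Beyond splitting the model apart, a cheaper first check is to compare simple per-layer statistics of the loaded weights against the PyTorch originals: matching element counts and means for every key strongly suggest each layer was loaded intact. A framework-free sketch using plain lists in place of tensors (the function name is illustrative):

```python
def layer_summaries(state_dict):
    """Map each key to (element_count, mean) for quick cross-checking
    between a PyTorch state dict and a loaded TorchSharp model."""
    out = {}
    for key, values in state_dict.items():
        flat = list(values)
        out[key] = (len(flat), sum(flat) / len(flat))
    return out
```

Running the same summary on both sides and diffing the results would catch layers that were silently skipped or re-initialized during loading.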