Embarcadero / P4D-Data-Sciences

A collection of lightweight Python wrappers based on Python4Delphi simplifying Data Sciences development with Delphi
MIT License

PyTorch and HTTPS error (Mac) #10

Open peardox opened 2 years ago

peardox commented 2 years ago

This one is a situation that I've come across before with Lazarus (it was fixed for FPC)

Torch will optionally download models from a centralized repository if required. When it does this it uses Python's urllib to obtain the file(s) it needs over HTTPS, which results in the mess pictured below...

[Image: machttps]

I suspect Delphi Python on Mac has the same issue Lazarus had, in that it expects OpenSSL, which Apple deprecated some years ago in favour of their own SSL transport layer (hey, it's Apple - what do you expect...)

To replicate all you need to do is to try loading a public model - VGG16 in the case above - and the Mac will go all "Apple" on you.

This situation can be worked around by providing the required models either on-demand via a Delphi download or by including them in the deployment package (they can be big, depending on the model you choose - VGG16/19 are each 500MB+).
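Whichever route delivers the file, it may be worth checking that a 500MB+ download or copy arrived intact. A minimal sketch, assuming the file keeps its original Torch name: Torch's published checkpoints carry the first hex digits of their SHA-256 in the file name (the 397923af in vgg16-397923af.pth), so the name itself can serve as a checksum. The function name matches_name_hash is hypothetical and not part of Torch or P4D.

```python
import hashlib
import re
from pathlib import Path


def matches_name_hash(path):
    """Return True if the file's SHA-256 starts with the hex suffix in its name."""
    path = Path(path)
    # Expect names like "vgg16-397923af.pth": a dash, then 8+ hex digits, then the extension.
    match = re.search(r"-([0-9a-f]{8,})$", path.stem)
    if match is None:
        raise ValueError(f"no hash suffix in file name: {path.name}")

    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in 1 MiB chunks so a large model file is never held in memory at once.
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest().startswith(match.group(1))
```

Calling matches_name_hash on the copied file before loading it gives an early, readable failure instead of an obscure error from Torch later on.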

As a work-around exists it seems sensible to "patch the developer" rather than the system in a "known issues" section in the Docs/README.md by informing them of this problem and how to work around it.

I'm going to alter the Python my stuff uses to cater for this eventuality and will provide a tech-note with clear instructions.

peardox commented 2 years ago

I've worked out the method to allow Macs to work with models from local copies

Just going to run a model to double check it works and compare to previously generated model (this will take some time - might run it at night)

lmbelo commented 2 years ago

I will try this as soon as possible.

peardox commented 2 years ago

As a note I've identified the specific Python that was triggering this issue so trying to call the following (extracted + untested) will upset all Macs...

from collections import namedtuple
import torch
from torchvision import models
from torchvision.models import VGG16_Weights, VGG19_Weights

class Vgg16(torch.nn.Module):
  def __init__(self, requires_grad=False, vgg_path=None):
    super(Vgg16, self).__init__()
    # Fetches the pretrained weights over HTTPS, which is the step that fails on the Mac
    dies = models.vgg16(weights=VGG16_Weights.IMAGENET1K_V1).features

vgg = Vgg16(requires_grad=False)

But this version, replacing the bad one, will work as long as vgg_path points at a copy of vgg16-397923af.pth. This file is obtainable by running the bad version on a non-Mac OS, where it'll end up in ($HOME)/.cache/torch somewhere. You can then copy it to the Mac and set vgg_path to it.

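# Inside Vgg16.__init__, replacing the line that downloads the weights:
works = models.vgg16()  # builds the architecture only, nothing is downloaded
works.load_state_dict(torch.load(vgg_path), strict=False)
works_features = works.features

vgg = Vgg16(requires_grad=False, vgg_path='path to vgg 16 as above')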
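The two halves of the work-around (locating the downloaded file on the non-Mac machine, then loading it from a local path on the Mac) could be sketched roughly as below. This is an illustration under assumptions, not the code used in the project: the helper names find_checkpoint and load_vgg16_features are hypothetical, and the default cache root is simply the location mentioned above. The search walks the whole directory because the exact sub-folder is not stated here.

```python
from pathlib import Path


def find_checkpoint(name="vgg16-397923af.pth", cache_root="~/.cache/torch"):
    """Search the cache directory tree for a checkpoint; return its Path or None."""
    root = Path(cache_root).expanduser()
    if not root.is_dir():
        return None
    matches = sorted(root.rglob(name))
    return matches[0] if matches else None


def load_vgg16_features(vgg_path):
    """Build VGG16 and load its weights from a local file, never from the network."""
    weights_file = Path(vgg_path).expanduser()
    if not weights_file.is_file():
        # Fail early with a clear message rather than reaching any download code.
        raise FileNotFoundError(f"VGG16 weights not found: {weights_file}")

    # Imported here so the path check above also works where torch is not installed.
    import torch
    from torchvision import models

    model = models.vgg16()  # default arguments: architecture only, no download
    state = torch.load(weights_file, map_location="cpu")
    model.load_state_dict(state, strict=False)  # strict=False mirrors the snippet above
    return model.features
```

On the non-Mac machine, find_checkpoint() reports where the file landed. On the Mac, load_vgg16_features('path to the copied file') returns the same .features module the original code used.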

I've now verified the non-invasive work-around (took 3 hours and 15 mins to train two models on my Laptop's 3060) and I've checked that the code at least functions on my 2013 MacBook using the same conditions (it's not worth running a proper test - it'll take forever).

I'll go rent another M1 Mac Mini for the day (2.50 euros I can just about afford) so I can rerun on an M1 the same test I just completed while sleeping.

I'm unsure as to the performance of the M1, but for comparison: with the GPU, one of my AFK tests took 90 mins, while the same test has an ETA of 32 hours with GPU usage disabled (I've killed the job - just wanted the ETA). This means that using a GPU boosts processing around 20x (I didn't run the CPU test long enough for a properly accurate ETA).
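For reference, the arithmetic behind the "around 20x" figure, using only the two numbers quoted above (the CPU figure is an ETA, not a measured run, so the result is approximate):

```python
gpu_minutes = 90        # measured GPU run
cpu_minutes = 32 * 60   # CPU-only ETA of 32 hours, converted to minutes

speedup = cpu_minutes / gpu_minutes
print(f"GPU speed-up: about {speedup:.1f}x")  # prints "GPU speed-up: about 21.3x"
```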

As the M1 is supposed to be transparently supported by Torch, at least I'll get an idea of the speed compared to my Laptop - the M1 I can rent ATM only has 8GB so this will only be a rough test.

After I've run the M1 tests I'll document my findings.

All this is pretty academic - I can't build my full project on a Mac anyway (Skia needs Delphi 11 - CE won't work) - same goes for Linux (for different reasons).

peardox commented 2 years ago

Won't install on M1 (fine on X64)

See https://github.com/Embarcadero/PythonEnviroments/issues/1#issuecomment-1195365073