Open andreasjansson opened 2 years ago
Probably broken in #696 or #697.
Not sure whether it's this issue specifically, but every time we run the scraper something seems to break, and quite often versions are removed because bits of the scraper are broken.
As I think we've talked about before, I wonder whether we should have a manually created matrix, with a script to help us add new things? That would ensure old things don't break.
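To make the matrix idea a bit more concrete, here is a rough sketch of what an "add new things without breaking old things" helper could look like. Everything in it is hypothetical: the `Matrix` type, the `add_entries` name, and the example rows are made up for illustration and are not taken from cog's code or its real compatibility data.

```python
# Hypothetical sketch: none of these names come from cog itself.
from typing import Dict, Tuple

# Key: (framework, framework version). Value: CUDA version.
# This shape is an assumption chosen for illustration only.
Matrix = Dict[Tuple[str, str], str]


def add_entries(matrix: Matrix, new_entries: Matrix) -> Matrix:
    """Return a copy of `matrix` with `new_entries` merged in.

    Existing rows are never removed, and a new row that contradicts an
    existing one raises an error instead of silently overwriting it.
    """
    merged = dict(matrix)
    for key, cuda in new_entries.items():
        if key in merged and merged[key] != cuda:
            raise ValueError(
                f"{key} already maps to CUDA {merged[key]}; "
                f"refusing to change it to {cuda}"
            )
        merged[key] = cuda
    return merged


# Illustrative rows only, not cog's actual matrix.
existing = {("torch", "1.13.0"): "11.7"}
updated = add_entries(existing, {("torch", "2.0.0"): "11.8"})
```

The point of the design is that a scraper run could only ever add rows; anything that would drop or change an existing row fails loudly and needs a human to look at it.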
I investigated this a bit, and I don't think this is related to #696 or #697. I think this is a problem with installing both torch and tensorflow together. I reckon that's just not possible right now and it fails silently. A quick fix here would be to simply throw an error if you try to do that? Or just not do CUDA version resolution on TensorFlow, or something.
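For what it's worth, the "throw an error" quick fix could look roughly like the sketch below. It is written in Python purely for illustration; a real check would have to live in cog's Go config code, and the function name, the package-name set, and the error wording are all assumptions rather than anything cog provides.

```python
from typing import Iterable

# Assumed package names for the two frameworks discussed in this thread.
CONFLICTING_GPU_FRAMEWORKS = {"torch", "tensorflow"}


def check_gpu_frameworks(python_packages: Iterable[str], gpu: bool) -> None:
    """Fail loudly instead of silently skipping a framework.

    `python_packages` is a list of pip-style requirement strings such as
    "tensorflow==2.8.0". Only the simple "name==version" form is handled.
    """
    if not gpu:
        # Without GPU support there is no CUDA resolution to conflict over.
        return
    names = {pkg.split("==")[0].strip().lower() for pkg in python_packages}
    found = CONFLICTING_GPU_FRAMEWORKS & names
    if len(found) > 1:
        raise ValueError(
            "cannot resolve a CUDA version for more than one GPU framework: "
            + ", ".join(sorted(found))
        )
```

This only turns the silent failure into a visible one. It does not try to find a CUDA version that both frameworks can share, which would be the better long-term behaviour if such a version exists.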
I can confirm this issue affects the latest cog version (0.6.1) and the tensorflow x pytorch combination.
It works with `gpu: false`, though.
I believe the issue is here: https://github.com/replicate/cog/blob/1f8fec1a52eb407d4be4271726ce29f46f8e543b/pkg/config/compatibility.go#L260
I'm not really familiar with Go, but I could try a PR if anyone is interested. I see no reason for this to fail if compatible versions are available.
The following cog.yaml will not install any version of tensorflow:
However, when I add `pip install tensorflow==2.8.0` to `run:`, it works. My guess is that something in the CUDA compatibility logic is breaking.