voxel51 / eta

ETA: Extensible Toolkit for Analytics
https://voxel51.com
Apache License 2.0

Inconsistent success/failure when downloading models from Google drive via HTTP #376

Open tylerganter opened 4 years ago

tylerganter commented 4 years ago

Getting the following error when deploying a new sense version:

Downloading model from Google Drive ID '1pC9WX7Ol2cy4ERAcLQ-a2Dd04d91vJxj' to '/Users/tylerganter/source/theta/eta/eta/models/ssd-resnet50-fpn-coco.pb'
print(os.path.exists(model_path))

Uncaught exception
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/tylerganter/source/theta/eta/eta/core/models.py", line 180, in download_model
    model.manager.download_model(model_path, force=force)
  File "/Users/tylerganter/source/theta/eta/eta/core/models.py", line 1018, in download_model
    self._download_model(model_path)
  File "/Users/tylerganter/source/theta/eta/eta/core/models.py", line 1074, in _download_model
    etaw.download_google_drive_file(gid, path=model_path)
  File "/Users/tylerganter/source/theta/eta/eta/core/web.py", line 69, in download_google_drive_file
    return sess.write(path, fid) if path else sess.get(fid)
  File "/Users/tylerganter/source/theta/eta/eta/core/web.py", line 155, in write
    return WebSession.write(self, path, self.BASE_URL, params={"id": fid})
  File "/Users/tylerganter/source/theta/eta/eta/core/web.py", line 120, in write
    r = self._get_streaming_response(url, params=params)
  File "/Users/tylerganter/source/theta/eta/eta/core/web.py", line 158, in _get_streaming_response
    r = WebSession._get_streaming_response(self, url, params=params)
  File "/Users/tylerganter/source/theta/eta/eta/core/web.py", line 136, in _get_streaming_response
    raise WebSessionError("Unable to get '%s'" % url)
eta.core.web.WebSessionError: Unable to get 'https://drive.google.com/uc?export=download'

You can replicate the bug with the following:

import os

import eta.core.web as etaw

gid = "1pC9WX7Ol2cy4ERAcLQ-a2Dd04d91vJxj"
model_path = "/Users/tylerganter/source/theta/eta/eta/models/ssd-resnet50-fpn-coco.pb"

# Check whether the model file exists before and after the download attempt
print(os.path.exists(model_path))
etaw.download_google_drive_file(gid, path=model_path)  # raises WebSessionError for this file ID
print(os.path.exists(model_path))

Note that this only happens for certain models (I think the ones that have been downloaded a lot recently).

It seems to be caused by too many requests to download the file. I found this article: https://www.ghacks.net/2017/04/14/fix-google-drive-sorry-you-cant-view-or-download-this-file-error/ and a few others with a similar suggestion, to replace

https://drive.google.com/uc

with

https://drive.google.com/open

But as @brimoor noticed, this downloads something else that is not the correct file.
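One rough way to catch the "downloads something else" case is to check whether the downloaded bytes look like an HTML page rather than a model file. This is only a sketch: the helper names `looks_like_html` and `file_looks_like_html` and the 512-byte sniff size are hypothetical, not part of ETA, and the check is a heuristic.

```python
def looks_like_html(head: bytes) -> bool:
    """Heuristic: True if the leading bytes look like an HTML document."""
    start = head.lstrip().lower()
    return start.startswith(b"<!doctype html") or start.startswith(b"<html")


def file_looks_like_html(path: str, sniff_bytes: int = 512) -> bool:
    """Read the first `sniff_bytes` bytes of `path` and apply the heuristic."""
    with open(path, "rb") as f:
        return looks_like_html(f.read(sniff_bytes))
```

Calling `file_looks_like_html(model_path)` after a download would then flag a web page that was saved in place of the `.pb` file.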

tylerganter commented 4 years ago

Closed PR with initial attempt to fix: https://github.com/voxel51/eta/pull/375

brimoor commented 4 years ago

WTF is going on...

import eta.core.models as etam
etam.download_model("ssd-resnet50-fpn-coco", force=True)  # fails
etam.download_model("ssd-mobilenet-v1-coco", force=True)  # works

brimoor commented 4 years ago

FYI the reason eta.core.web.download_google_drive_file is not a straightforward implementation is that Google Drive shows annoying "large file" warnings when downloading large files, and these warnings must be handled. See this SO question, which inspired my original implementation:

https://stackoverflow.com/questions/25010369
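
For readers unfamiliar with the "large file" warning flow, here is a minimal sketch of the commonly described workaround. It is not ETA's actual code. It assumes that Drive signals the warning with a cookie whose name starts with `download_warning`, and that repeating the request with that value as a `confirm` parameter returns the file. This is historically reported behavior that Google may have changed. The names `find_confirm_token` and `download_drive_file` are hypothetical, and `session` is assumed to behave like a `requests.Session`.

```python
from typing import Mapping, Optional

# Base URL taken from the traceback in this issue
BASE_URL = "https://drive.google.com/uc?export=download"


def find_confirm_token(cookies: Mapping[str, str]) -> Optional[str]:
    """Return the value of the first cookie named 'download_warning*', if any.

    The cookie name is an assumption based on reported Drive behavior.
    """
    for name, value in cookies.items():
        if name.startswith("download_warning"):
            return value
    return None


def download_drive_file(session, file_id: str, path: str,
                        chunk_size: int = 64 * 1024) -> None:
    """Stream a Drive file to `path`, retrying once with a confirm token.

    `session` is expected to behave like `requests.Session`.
    """
    response = session.get(BASE_URL, params={"id": file_id}, stream=True)

    # If Drive answered with the warning, repeat the request with the token
    token = find_confirm_token(response.cookies)
    if token is not None:
        response = session.get(
            BASE_URL, params={"id": file_id, "confirm": token}, stream=True
        )

    response.raise_for_status()
    with open(path, "wb") as f:
        for chunk in response.iter_content(chunk_size):
            if chunk:  # skip empty keep-alive chunks
                f.write(chunk)
```

With `requests` installed, this would be called as `download_drive_file(requests.Session(), gid, model_path)`. Note that the sketch does not handle quota-style failures such as the one reported in this issue.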