marieai / marie-ai

Integrate AI-powered Document Analysis Pipelines
MIT License

certificate verify failed: unable to get local issuer certificate #99

Open gregbugaj opened 8 months ago

gregbugaj commented 8 months ago

Describe the bug: During startup we are getting the following exception.

INFO   marie@33 Kwargs : {'metas': {'name': '', 'description': '', 'workspace': '', 'py_modules': ['marie.executor.text']}, 'requests': {}, 'dynamic_batching':                     
       {}, 'runtime_args': {'workspace': None, 'shard_id': 0, 'shards': 1, 'replicas': 3, 'name': 'extract_t/rep-2', 'provider': [<ProviderType.NONE: 0>],                          
       'metrics_registry': <prometheus_client.registry.CollectorRegistry object at 0x7fd73086ce50>, 'tracer_provider': None, 'meter_provider': None}}                               
ERROR  extract_t/rep-1@32 SSLError(MaxRetryError("HTTPSConnectionPool(host='dl.fbaipublicfiles.com', port=443): Max retries exceeded with url:                                      
       /fairseq/gpt2_bpe/vocab.bpe (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get                       
       local issuer certificate (_ssl.c:1007)')))")) during 'WorkerRuntime' initialization                                                                                          
        add "--quiet-error" to suppress the exception details                                                                                                                       
       Traceback (most recent call last):                                                                                                                                           
         File "/opt/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 715, in urlopen                                                                              
           httplib_response = self._make_request(                                                                                                                                   
         File "/opt/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 404, in _make_request                                                                        
           self._validate_conn(conn)                                                                                                                                                
         File "/opt/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1058, in _validate_conn                                                                      
           conn.connect()                                                                                                                                                           
         File "/opt/venv/lib/python3.10/site-packages/urllib3/connection.py", line 419, in connect                                                                                  
           self.sock = ssl_wrap_socket(                                                                                                                                             
         File "/opt/venv/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 449, in ssl_wrap_socket                                                                           
           ssl_sock = _ssl_wrap_socket_impl(                                                                                                                                        
         File "/opt/venv/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl                                                                     
           return ssl_context.wrap_socket(sock, server_hostname=server_hostname)                                                                                                    
         File "/usr/lib/python3.10/ssl.py", line 513, in wrap_socket                                                                                                                
           return self.sslsocket_class._create(                                                                                                                                     
         File "/usr/lib/python3.10/ssl.py", line 1071, in _create                                                                                                                   
           self.do_handshake()                                                                                                                                                      
         File "/usr/lib/python3.10/ssl.py", line 1342, in do_handshake                                                                                                              
           self._sslobj.do_handshake()                                                                                                                                              
       ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)                               

       During handling of the above exception, another exception occurred:                                                                                                          

       Traceback (most recent call last):                                                                                                                                           
         File "/opt/venv/lib/python3.10/site-packages/requests/adapters.py", line 486, in send                                                                                      
           resp = conn.urlopen(                                                                                                                                                     
         File "/opt/venv/lib/python3.10/site-packages/urllib3/connectionpool.py", line 799, in urlopen                                                                              
           retries = retries.increment(                                                                                                                                             
         File "/opt/venv/lib/python3.10/site-packages/urllib3/util/retry.py", line 592, in increment                                                                                
           raise MaxRetryError(_pool, url, error or ResponseError(cause))                                                                                                           
       urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='dl.fbaipublicfiles.com', port=443): Max retries exceeded with url:                                               
       /fairseq/gpt2_bpe/vocab.bpe (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get                       
       local issuer certificate (_ssl.c:1007)')))                                                                                                                                   

       During handling of the above exception, another exception occurred:                                                                                                          

       Traceback (most recent call last):                                                                                                                                           
         File "/opt/venv/lib/python3.10/site-packages/marie/serve/executors/run.py", line 143, in run                                                                               
           runtime = AsyncNewLoopRuntime(                                                                                                                                           
         File "/opt/venv/lib/python3.10/site-packages/marie/serve/runtimes/asyncio.py", line 93, in __init__                                                                        
           self._loop.run_until_complete(self.async_setup())                                                                                                                        
         File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete                                                                                         
           return future.result()                                                                                                                                                   
         File "/opt/venv/lib/python3.10/site-packages/marie/serve/runtimes/asyncio.py", line 310, in async_setup                                                                    
           self.server = self._get_server()                                                                                                                                         
         File "/opt/venv/lib/python3.10/site-packages/marie/serve/runtimes/asyncio.py", line 215, in _get_server                                                                    
           return GRPCServer(                                                                                                                                                       
         File "/opt/venv/lib/python3.10/site-packages/marie/serve/runtimes/servers/grpc.py", line 34, in __init__                                                                   
           super().__init__(**kwargs)                                                                                                                                               
         File "/opt/venv/lib/python3.10/site-packages/marie/serve/runtimes/servers/__init__.py", line 63, in __init__                                                               
           ] = (req_handler or self._get_request_handler())                                                                                                                         
         File "/opt/venv/lib/python3.10/site-packages/marie/serve/runtimes/servers/__init__.py", line 88, in _get_request_handler                                                   
           return self.req_handler_cls(                                                                                                                                             
         File "/opt/venv/lib/python3.10/site-packages/marie/serve/runtimes/worker/request_handling.py", line 140, in __init__                                                       
           self._load_executor(                                                                                                                                                     
         File "/opt/venv/lib/python3.10/site-packages/marie/serve/runtimes/worker/request_handling.py", line 377, in _load_executor                                                 
           self._executor: BaseExecutor = BaseExecutor.load_config(                                                                                                                 
         File "/opt/venv/lib/python3.10/site-packages/marie/jaml/__init__.py", line 792, in load_config                                                                             
           obj = JAML.load(tag_yml, substitute=False, runtime_args=runtime_args)                                                                                                    
         File "/opt/venv/lib/python3.10/site-packages/marie/jaml/__init__.py", line 174, in load                                                                                    
           r = yaml.load(stream, Loader=get_jina_loader_with_runtime(runtime_args))                                                                                                 
         File "/opt/venv/lib/python3.10/site-packages/yaml/__init__.py", line 81, in load                                                                                           
           return loader.get_single_data()                                                                                                                                          
         File "/opt/venv/lib/python3.10/site-packages/yaml/constructor.py", line 51, in get_single_data                                                                             
           return self.construct_document(node)                                                                                                                                     
         File "/opt/venv/lib/python3.10/site-packages/yaml/constructor.py", line 55, in construct_document                                                                          
           data = self.construct_object(node)                                                                                                                                       
         File "/opt/venv/lib/python3.10/site-packages/yaml/constructor.py", line 100, in construct_object                                                                           
           data = constructor(self, node)                                                                                                                                           
         File "/opt/venv/lib/python3.10/site-packages/marie/jaml/__init__.py", line 582, in _from_yaml                                                                              
           return get_parser(cls, version=data.get('version', None)).parse(                                                                                                         
         File "/opt/venv/lib/python3.10/site-packages/marie/jaml/parsers/executor/legacy.py", line 46, in parse                                                                     
           obj = cls(                                                                                                                                                               
         File "/opt/venv/lib/python3.10/site-packages/marie/serve/executors/decorators.py", line 58, in arg_wrapper                                                                 
           f = func(self, *args, **kwargs)                                                                                                                                          
         File "/opt/venv/lib/python3.10/site-packages/marie/serve/helper.py", line 74, in arg_wrapper                                                                               
           f = func(self, *args, **kwargs)                                                                                                                                          
         File "/opt/venv/lib/python3.10/site-packages/marie/executor/text/text_extraction_executor.py", line 62, in __init__                                                        
           self.pipeline = ExtractPipeline(pipeline_config=pipeline, cuda=use_cuda)                                                                                                 
         File "/opt/venv/lib/python3.10/site-packages/marie/pipe/extract_pipeline.py", line 99, in __init__                                                                         
           self.ocr_engines = get_known_ocr_engines(device=device)                                                                                                                  
         File "/opt/venv/lib/python3.10/site-packages/marie/pipe/components.py", line 110, in get_known_ocr_engines                                                                 
           ocr_engines["default"] = DefaultOcrEngine(cuda=use_cuda)                                                                                                                 
         File "/opt/venv/lib/python3.10/site-packages/marie/ocr/default_ocr_engine.py", line 45, in __init__                                                                        
           self.ocr_processor = TrOcrProcessor(                                                                                                                                     
         File "/opt/venv/lib/python3.10/site-packages/marie/document/trocr_ocr_processor.py", line 250, in __init__                                                                 
           ) = init(model_path, beam, device)                                                                                                                                       
         File "/opt/venv/lib/python3.10/site-packages/marie/document/trocr_ocr_processor.py", line 61, in init                                                                      
           model, cfg, inference_task = fairseq.checkpoint_utils.load_model_ensemble_and_task(                                                                                      
         File "/opt/venv/lib/python3.10/site-packages/fairseq/checkpoint_utils.py", line 502, in load_model_ensemble_and_task                                                       
           model = task.build_model(cfg.model, from_checkpoint=True)                                                                                                                
         File "/opt/venv/lib/python3.10/site-packages/fairseq/tasks/fairseq_task.py", line 691, in build_model                                                                      
           model = models.build_model(args, self, from_checkpoint)                                                                                                                  
         File "/opt/venv/lib/python3.10/site-packages/fairseq/models/__init__.py", line 106, in build_model                                                                         
           return model.build_model(cfg, task)                                                                                                                                      
         File "/opt/venv/lib/python3.10/site-packages/marie/models/unilm/trocr/trocr_models.py", line 169, in build_model                                                           
           roberta = torch.hub.load('pytorch/fairseq:main', 'roberta.large')                                                                                                        
         File "/opt/venv/lib/python3.10/site-packages/torch/hub.py", line 566, in load                                                                                              
           model = _load_local(repo_or_dir, model, *args, **kwargs)                                                                                                                 
         File "/opt/venv/lib/python3.10/site-packages/torch/hub.py", line 595, in _load_local                                                                                       
           model = entry(*args, **kwargs)                                                                                                                                           
         File "/opt/venv/lib/python3.10/site-packages/fairseq/models/roberta/model.py", line 380, in from_pretrained                                                                
           return RobertaHubInterface(x["args"], x["task"], x["models"][0])                                                                                                         
         File "/opt/venv/lib/python3.10/site-packages/fairseq/models/roberta/hub_interface.py", line 26, in __init__                                                                
           self.bpe = encoders.build_bpe(cfg.bpe)                                                                                                                                   
         File "/opt/venv/lib/python3.10/site-packages/fairseq/registry.py", line 65, in build_x                                                                                     
           return builder(cfg, *extra_args, **extra_kwargs)                                                                                                                         
         File "/opt/venv/lib/python3.10/site-packages/fairseq/data/encoders/gpt2_bpe.py", line 33, in __init__                                                                      
           vocab_bpe = file_utils.cached_path(cfg.gpt2_vocab_bpe)                                                                                                                   
         File "/opt/venv/lib/python3.10/site-packages/fairseq/file_utils.py", line 174, in cached_path                                                                              
           return get_from_cache(url_or_filename, cache_dir)                                                                                                                        
         File "/opt/venv/lib/python3.10/site-packages/fairseq/file_utils.py", line 299, in get_from_cache                                                                           
           response = request_wrap_timeout(                                                                                                                                         
         File "/opt/venv/lib/python3.10/site-packages/fairseq/file_utils.py", line 251, in request_wrap_timeout                                                                     
           return func(timeout=timeout)                                                                                                                                             
         File "/opt/venv/lib/python3.10/site-packages/requests/api.py", line 100, in head                                                                                           
           return request("head", url, **kwargs)                                                                                                                                    
         File "/opt/venv/lib/python3.10/site-packages/requests/api.py", line 59, in request                                                                                         
           return session.request(method=method, url=url, **kwargs)                                                                                                                 
         File "/opt/venv/lib/python3.10/site-packages/requests/sessions.py", line 589, in request                                                                                   
           resp = self.send(prep, **send_kwargs)                                                                                                                                    
         File "/opt/venv/lib/python3.10/site-packages/requests/sessions.py", line 703, in send                                                                                      
           r = adapter.send(request, **kwargs)                                                                                                                                      
         File "/opt/venv/lib/python3.10/site-packages/requests/adapters.py", line 517, in send                                                                                      
           raise SSLError(e, request=request)                                                                                                                                       
       requests.exceptions.SSLError: HTTPSConnectionPool(host='dl.fbaipublicfiles.com', port=443): Max retries exceeded with url: /fairseq/gpt2_bpe/vocab.bpe                       
       (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate                          
       (_ssl.c:1007)')))                                                                                  

One solution is to cache the file locally so that no download is attempted during startup.
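A minimal sketch of that idea, assuming the file has been placed in a local directory ahead of time (for example, baked into the image). The helper name `local_copy_or_url` and the directory are hypothetical. How the resolved path is handed to fairseq is also an assumption: the traceback shows the value being read from `cfg.gpt2_vocab_bpe`, so overriding that field is one candidate, but this has not been verified here.

```python
from pathlib import Path
from urllib.parse import urlparse


def local_copy_or_url(url: str, local_dir: str) -> str:
    """Return the path of a pre-downloaded copy of `url` if one exists
    in `local_dir`; otherwise return `url` unchanged.

    Hypothetical helper for illustration, not part of marie-ai or fairseq.
    """
    filename = Path(urlparse(url).path).name  # e.g. "vocab.bpe"
    candidate = Path(local_dir) / filename
    return str(candidate) if candidate.is_file() else url


# URL taken from the traceback above; the directory is an assumption.
VOCAB_URL = "https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/vocab.bpe"
vocab_source = local_copy_or_url(VOCAB_URL, "/opt/marie/cache")
```

If the local copy exists, `vocab_source` is a filesystem path and no HTTPS request is needed to resolve it; otherwise the original URL is returned and the behavior is unchanged.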

Rithsek99 commented 8 months ago

host='dl.fbaipublicfiles.com' is served with a certificate issued by ESET-SSL-Filter-CA. If that CA is not included in "/opt/venv/lib/python3.10/site-packages/certifi/cacert.pem", verification fails with the SSLError above. Possible solution: append the content of ESET-SSL-Filter-CA.pem to "/opt/venv/lib/python3.10/site-packages/certifi/cacert.pem".

ref: about:certificate?cert=MIID9DCCAtygAwIBAgIQaWa9I7zQtK01rEhHEueegTANBgkqhkiG9w0BAQsFADBIMRswGQYDVQQDDBJFU0VUIFNTTCBGaWx0ZXIgQ0ExHDAaBgNVBAoME0VTRVQsIHNwb2wuIHMgci4gby4xCzAJBgNVBAYTAlNLMB4XDTIzMDMwNjAwMDAwMFoXDTI0MDMwNTIzNTk1OVowcDELMAkGA1UEBhMCVVMxEzARBgNVBAgTCkNhbGlmb3JuaWExEzARBgNVBAcTCk1lbmxvIFBhcmsxFzAVBgNVBAoTDkZhY2Vib29rLCBJbmMuMR4wHAYDVQQDDBUqLmZiYWlwdWJsaWNmaWxlcy5jb20wggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQCs4Ap4RKHy2SU%2FZ7gwFfKfqc1DlFAs2ntYXmCRLMaVBLn2Y8ZwAsgsLTTkpkb8UYnW%2FO79tOpjR9GxoCC9d5VoUPnXJ%2B%2FpzWH%2BNXA2cjOq4VSQ8TsDx8I8WWqyATY405S%2BL7ghOz7SPoRP3srkpcuCQQzPwjjlZ4KLMqktMbF3MdFjaoQo39g4Y3goBrkYBVpJjM1DPtciuQc4kZVaYveo7oczLa%2FsN29lCU2VLSOJLhZ7UHwcpq7%2FV3ZpmHYctruhyIdAgvT2qBiSrlBox7vR5AMzHEgojlGrKfCYAZ1Wq6C33SnMteaSZzWHSmtWyjjjUhYenypxxuZQpWlffblRAgMBAAGjgbEwga4wCwYDVR0PBAQDAgWgMB0GA1UdDgQWBBSo98jaWaS03U5zdsAYys5bj%2FZsEjAfBgNVHSMEGDAWgBR3QZ5210hxi9rZtJ5OlNlw4eP2cjA1BgNVHREELjAsghUqLmZiYWlwdWJsaWNmaWxlcy5jb22CE2ZiYWlwdWJsaWNmaWxlcy5jb20wHQYDVR0lBBYwFAYIKwYBBQUHAwEGCCsGAQUFBwMCMAkGA1UdEwQCMAAwDQYJKoZIhvcNAQELBQADggEBAHPiJuUa4AHjeWtAxcr9Ry7iDmmOl6fjf9kg6T84lp1ticIAYpNiAL0KsgfOczKJM7Nbx756ycldGLHl1IkIYov7XwEJ3x8UACV%2Fr1xHgeC%2BRKdD42RwWmW6%2Bpvchx9tZixmlaMeWhSGVyAi5%2FJ9oNV7jkDCdhQLP8WWR6mZT3znExlRXJQ24qZL4DjfmWVwePL2D2hcxYIEujaiEPhRSYy6RM1ByE%2Fw6BY9RyiKdYgWidqsBSYXgyDtvaPZNUH%2FmCHDS%2BnhcnujH8h6E6Raj0bJOYB8k5jtN15uo%2B4pWTENUEXC9OwGBDYCyHbPoXCzQh%2FAPVutAw8VWCpfDFBSWeo%3D&cert=MIIDgDCCAmigAwIBAgIQGm3Mlv1c%2BvR098q3bghNcTANBgkqhkiG9w0BAQsFADBIMRswGQYDVQQDDBJFU0VUIFNTTCBGaWx0ZXIgQ0ExHDAaBgNVBAoME0VTRVQsIHNwb2wuIHMgci4gby4xCzAJBgNVBAYTAlNLMB4XDTIzMTAzMDE1NTUxN1oXDTMzMTAyNzE1NTUxN1owSDEbMBkGA1UEAwwSRVNFVCBTU0wgRmlsdGVyIENBMRwwGgYDVQQKDBNFU0VULCBzcG9sLiBzIHIuIG8uMQswCQYDVQQGEwJTSzCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAJjdOVI2CUSx4ABMhf9Cg4BiqkJzCD%2FMQdhwjvLCrAytOPOvDE2ig8je%2FX3Fn0m2d3QM3rdEZAPHGsJ%2FWgYYa221U7P%2BH%2By5ESj36fF1QwiHsAc6jnFQtHzr1oBaNsBMx9omMGk%2Bxh1OOyfsXSJZue9EPBC1VpnOoV3fV%2BnyLotLk1Aja96SaIqlPwAhX6dFcNMfFjfDM5kvO65ZD4XG
T7JPaGB8bZKdMng0p1e38dVbCS2yrGFIuWn82RvMFwNAqAmSGj%2FWAoPKlQvRkoqeFeerR97NboJoS4xcUK4Z5BURueB2xtZS7yzMsAUCY%2FBEHvqAW6vgzC28oqBLFND64ksCAwEAAaNmMGQwDgYDVR0PAQH%2FBAQDAgIEMBIGA1UdEwEB%2FwQIMAYBAf8CAQAwHQYDVR0OBBYEFHdBnnbXSHGL2tm0nk6U2XDh4%2FZyMB8GA1UdIwQYMBaAFHdBnnbXSHGL2tm0nk6U2XDh4%2FZyMA0GCSqGSIb3DQEBCwUAA4IBAQA3s8xsYqE%2BjzIuiz52JcTatGdURjEEnYM6kF4s4rVomc6KGTgBVyk8pTnJHoLaFTneWxKWeLAf4y%2BrPJxLR4u%2Fj9ULkmc9kdRjVpWKd21HQ0zDd5fazT495Q78VWmr2kS8lsOSyvG2EiBMb3YSNo7GG7cViq1p%2FtjxZLxExGy%2BXGGvbmg%2FWdwXUmdwKcvkVT%2BAmeZFyam02DRfLLhrDrC6gwzSZb8ar3vRT%2Fyq0OxgX%2BfJbD3PGYgshU1uiSQK80rYWA2yWm%2BwHiL6N4r4%2F97QZ9Hhj%2Fqhfi881bK4dVGQClqHPTXIkXAyXrbh1Idz7ZrmFXyJRY87CzJCphdcE6l9#
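The suggestion to append ESET-SSL-Filter-CA.pem to the certifi bundle can be sketched as follows. This is a sketch only, assuming the CA certificate has already been exported as a PEM file. The function name `append_ca_to_bundle` is hypothetical, and the paths in the usage comment are taken from this thread.

```python
from pathlib import Path


def append_ca_to_bundle(ca_pem: str, bundle: str) -> bool:
    """Append the PEM certificate in `ca_pem` to the CA bundle at `bundle`,
    unless the same certificate text is already present.

    Returns True if the bundle was modified, False otherwise.
    Hypothetical helper for illustration.
    """
    cert = Path(ca_pem).read_text(encoding="utf-8").strip()
    bundle_path = Path(bundle)
    existing = bundle_path.read_text(encoding="utf-8")
    if cert in existing:
        return False  # already trusted; keep the operation idempotent
    with bundle_path.open("a", encoding="utf-8") as fh:
        if existing and not existing.endswith("\n"):
            fh.write("\n")  # keep PEM blocks on separate lines
        fh.write(cert + "\n")
    return True


# Example call (paths as given in this thread):
# append_ca_to_bundle(
#     "ESET-SSL-Filter-CA.pem",
#     "/opt/venv/lib/python3.10/site-packages/certifi/cacert.pem",
# )
```

The bundle location can also be looked up with `certifi.where()` instead of hard-coding it. Because reinstalling or upgrading certifi replaces `cacert.pem`, a less fragile variant is to write the combined bundle to a separate file and point the `REQUESTS_CA_BUNDLE` environment variable at it, which `requests` honors.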