magland / sortingview

Web app for viewing results of ephys spike sorting
Apache License 2.0

Sortingview backend crashes #20

Closed khl02007 closed 3 years ago

khl02007 commented 3 years ago

kachery-daemon is running. I tried upgrading kachery, kachery-daemon, and hither.

Task requested: sortingview_get_python_package_version.1 (query)
Finished task: sortingview_get_python_package_version.1
Task requested: sortingview_get_python_package_version.1 (query)
Finished task: sortingview_get_python_package_version.1
{'success': False, 'error': 'Error posting json: 403 Missing client auth code in daemon request. You probably need to upgrade kachery-daemon or kachery.'}
Process Process-1:
Traceback (most recent call last):
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/site-packages/kachery_client/task_backend/_run_task_backend_worker.py", line 27, in _run_task_backend_worker
    requested_tasks = _register_task_functions(registered_task_functions, timeout_sec=4)
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/site-packages/kachery_client/task_backend/_run_task_backend_worker.py", line 81, in _register_task_functions
    raise Exception(f'Error registering task functions.')
Exception: Error registering task functions.
Cleaning up parallel job handler
Cleaning up parallel job handler
Cleaning up parallel job handler
Cleaning up parallel job handler
Cleaning up parallel job handler
Cleaning up parallel job handler
Cleaning up parallel job handler

Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/site-packages/kachery_client/task_backend/TaskBackend.py", line 100, in _stop_all_task_backends
    s.stop()
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/site-packages/kachery_client/task_backend/TaskBackend.py", line 33, in stop
    self._run_task_backend_pipe_to_worker.send({'type': 'exit'})
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
    self._send(header + buf)
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
magland commented 3 years ago

This is surprising to me. I added some checks and made a potential fix. Please upgrade to

kachery-daemon >=1.0.20 and kachery-client >=1.0.11
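
To confirm which versions are actually installed in the environment that runs the backend, something like the following could help. It is only a rough sketch: it assumes both packages are installed as pip distributions under the names `kachery-daemon` and `kachery-client`, and the helper names (`parse_version`, `meets_minimum`, `check_installed`) are made up for this example rather than taken from kachery.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions quoted in the comment above; the distribution
# names are an assumption about how the packages were installed.
MINIMUMS = {"kachery-daemon": "1.0.20", "kachery-client": "1.0.11"}


def parse_version(text):
    """Turn '1.0.20' into (1, 0, 20), keeping only the leading digits of each part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare numerically, so that 1.0.9 is correctly older than 1.0.11."""
    return parse_version(installed) >= parse_version(minimum)


def check_installed(minimums=MINIMUMS):
    """Map each package name to (installed_version_or_None, is_new_enough)."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, meets_minimum(installed, minimum))
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_installed().items():
        print(name, installed, "OK" if ok else "needs upgrade")
```

Run it with the same Python interpreter that launches sortingview-start-backend, since a different conda environment may have different versions installed.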

khl02007 commented 3 years ago

Here is another type of crash:

Traceback (most recent call last):
  File "/home/kacheryuser/miniconda3/envs/kachery-env/bin/sortingview-start-backend", line 6, in <module>
    sortingview.start_backend_cli()
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/site-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/site-packages/sortingview/backend/start_backend_cli.py", line 8, in start_backend_cli
    start_backend(channel=channel)
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/site-packages/sortingview/backend/start_backend.py", line 14, in start_backend
    kc.run_task_backend(
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/site-packages/kachery_client/task_backend/run_task_backend.py", line 32, in run_task_backend
    B.process_events()
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/site-packages/kachery_client/task_backend/TaskBackend.py", line 47, in process_events
    self._task_job_manager.process_events()
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/site-packages/kachery_client/task_backend/TaskJobManager.py", line 52, in process_events
    requested_task.update_status(status=job.status, error_message=error_message, result=result)
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/site-packages/kachery_client/task_backend/RequestedTask.py", line 35, in update_status
    _update_task_status(channel=self.registered_task_function.channel, task_id=self.task_id, task_function_id=self._registered_task_function.task_function_id, task_hash=self.task_hash, task_function_type=self.task_function_type, status=status, result=result, error_message=error_message)
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/site-packages/kachery_client/task_backend/_update_task_status.py", line 15, in _update_task_status
    result_content = simplejson.dumps(result, separators=(',', ':'), indent=None, allow_nan=False).encode()
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/site-packages/simplejson/__init__.py", line 398, in dumps
    return cls(
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/site-packages/simplejson/encoder.py", line 296, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/home/kacheryuser/miniconda3/envs/kachery-env/lib/python3.8/site-packages/simplejson/encoder.py", line 378, in iterencode
    return _iterencode(o, 0)
ValueError: Out of range float values are not JSON compliant
Cleaning up parallel job handler
Cleaning up parallel job handler
Cleaning up parallel job handler
Cleaning up parallel job handler
Cleaning up parallel job handler
Cleaning up parallel job handler
Cleaning up parallel job handler
magland commented 3 years ago

@khl02007 I think this is another example of NaN creeping into the results of some backend tasks. But obviously we don't want that to crash the backend. So I updated the backend so it won't crash on these instances.

You'll need to pip upgrade sortingview to 0.2.25 (and that should automatically bump kachery-client to >= 1.0.12). Then restart the backend, and report any further crashes.
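
For anyone writing their own backend tasks, here is a small illustration of why a NaN in a task result triggers the ValueError in the traceback above, plus one possible way to avoid it by replacing non-finite floats with None before serializing. This is not necessarily what the updated backend does. It uses the standard-library json module instead of simplejson so that it runs without extra installs, and the `sanitize` and `dumps_strict` helpers and the example result keys are hypothetical.

```python
import json
import math


def sanitize(value):
    """Recursively replace NaN and +/-inf with None so strict JSON encoding succeeds."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: sanitize(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize(item) for item in value]
    return value


def dumps_strict(result):
    """Serialize with allow_nan=False, as in the traceback, after sanitizing."""
    return json.dumps(sanitize(result), separators=(",", ":"), allow_nan=False)


# Hypothetical task result containing non-finite values.
result = {"firing_rate": float("nan"), "amplitudes": [1.5, float("inf")]}

try:
    json.dumps(result, allow_nan=False)
except ValueError as err:
    print("strict dump failed:", err)

print(dumps_strict(result))  # {"firing_rate":null,"amplitudes":[1.5,null]}
```

Mapping NaN to null loses the distinction between "not a number" and "missing", so whether this is acceptable depends on how the frontend interprets null values.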

magland commented 3 years ago

Closing because I think this has been resolved.