AllenNeuralDynamics / aind-watchdog-service

Data staging service that prepares acquisition data for cloud upload to Amazon S3 and Code Ocean processing.
https://allenneuraldynamics.github.io/aind-watchdog-service/
MIT License

Watchdog service fails to upload #33

Closed by alexpiet 2 months ago

alexpiet commented 3 months ago

Describe the bug: The upload is triggered, but the job runs into an error when it tries to archive the manifest. Log output:

2024-07-29 10:51:35,869 - root - INFO - Found event file C:\Users\svc_aind_behavior\Documents\aind_watchdog_service\manifest\manifest_behavior_708026_2024-07-29_09-23-46.yml
2024-07-29 10:51:35,912 - root - INFO - Scheduling job to run at 2024-07-29 23:00:00 C:\Users\svc_aind_behavior\Documents\aind_watchdog_service\manifest\manifest_behavior_708026_2024-07-29_09-23-46.yml
2024-07-29 10:51:36,193 - apscheduler.scheduler - INFO - Added job "RunJob.run_job" to job store "default"
2024-07-29 11:06:16,439 - root - INFO - Found event file C:\Users\svc_aind_behavior\Documents\aind_watchdog_service\manifest\manifest_behavior_0_2024-07-29_11-05-55.yml
2024-07-29 11:06:16,451 - root - INFO - Scheduling job to run at 2024-07-29 23:00:00 C:\Users\svc_aind_behavior\Documents\aind_watchdog_service\manifest\manifest_behavior_0_2024-07-29_11-05-55.yml
2024-07-29 11:06:16,452 - apscheduler.scheduler - INFO - Added job "RunJob.run_job" to job store "default"
2024-07-29 23:00:00,045 - apscheduler.scheduler - INFO - Removed job 07d98d271e5f425b9c5a71d514bd8746
2024-07-29 23:00:00,048 - apscheduler.executors.default - INFO - Running job "RunJob.run_job (trigger: date[2024-07-29 23:00:00 PDT], next run at: 2024-07-29 23:00:00 PDT)" (scheduled at 2024-07-29 23:00:00-07:00)
2024-07-29 23:00:00,078 - root - INFO - Running job for C:\Users\svc_aind_behavior\Documents\aind_watchdog_service\manifest\manifest_behavior_708026_2024-07-29_09-23-46.yml
2024-07-29 23:00:00,083 - apscheduler.executors.default - INFO - Running job "RunJob.run_job (trigger: date[2024-07-29 23:00:00 PDT], next run at: 2024-07-29 23:00:00 PDT)" (scheduled at 2024-07-29 23:00:00-07:00)
2024-07-29 23:00:00,083 - root - INFO - Running job for C:\Users\svc_aind_behavior\Documents\aind_watchdog_service\manifest\manifest_behavior_0_2024-07-29_11-05-55.yml
2024-07-29 23:00:00,083 - apscheduler.scheduler - INFO - Removed job 2fd568dc7d0b4ad593c90d944d4463f2
2024-07-29 23:00:00,225 - root - INFO - Executing command: ['robocopy', 'C:/behavior_data/446-6-D/0/behavior_0_2024-07-29_11-05-55/behavior', WindowsPath('//allen/aind/scratch/dynamic_foraging_rig_transfer/behavior_0_2024-07-29_11-05-55/behavior'), '/z', '/e', '/j', '/r:5']
2024-07-29 23:00:00,229 - root - INFO - Executing command: ['robocopy', 'C:/behavior_data/446-6-D/708026/behavior_708026_2024-07-29_09-23-46/behavior', WindowsPath('//allen/aind/scratch/dynamic_foraging_rig_transfer/behavior_708026_2024-07-29_09-23-46/behavior'), '/z', '/e', '/j', '/r:5']
2024-07-29 23:00:03,964 - root - INFO - Executing command: ['robocopy', 'C:/behavior_data/446-6-D/0/behavior_0_2024-07-29_11-05-55/behavior-videos', WindowsPath('//allen/aind/scratch/dynamic_foraging_rig_transfer/behavior_0_2024-07-29_11-05-55/behavior-videos'), '/z', '/e', '/j', '/r:5']
2024-07-29 23:00:06,051 - root - INFO - Executing command: ['robocopy', 'C:/behavior_data/446-6-D/0/behavior_0_2024-07-29_11-05-55/fib', WindowsPath('//allen/aind/scratch/dynamic_foraging_rig_transfer/behavior_0_2024-07-29_11-05-55/fib'), '/z', '/e', '/j', '/r:5']
2024-07-29 23:00:06,276 - root - INFO - Executing command: ['robocopy', 'C:\\behavior_data\\446-6-D\\0\\behavior_0_2024-07-29_11-05-55\\metadata-dir', '//allen/aind/scratch/dynamic_foraging_rig_transfer\\behavior_0_2024-07-29_11-05-55', 'session.json', '/j', '/r:5']
2024-07-29 23:00:06,770 - root - INFO - Executing command: ['robocopy', 'C:\\behavior_data\\446-6-D\\0\\behavior_0_2024-07-29_11-05-55\\metadata-dir', '//allen/aind/scratch/dynamic_foraging_rig_transfer\\behavior_0_2024-07-29_11-05-55', 'rig.json', '/j', '/r:5']
2024-07-29 23:00:08,208 - root - INFO - Job complete for C:\Users\svc_aind_behavior\Documents\aind_watchdog_service\manifest\manifest_behavior_0_2024-07-29_11-05-55.yml
2024-07-29 23:00:08,209 - root - INFO - Executing command: ['robocopy', 'C:\\Users\\svc_aind_behavior\\Documents\\aind_watchdog_service\\manifest', 'C:\\Users\\svc_aind_behavior\\Documents\\aind_watchdog_service\\manifest\\manifest_complete', 'manifest_behavior_0_2024-07-29_11-05-55.yml', '/j', '/r:5']
2024-07-29 23:00:08,237 - root - INFO - Found event file C:\Users\svc_aind_behavior\Documents\aind_watchdog_service\manifest\manifest_behavior_0_2024-07-29_11-05-55.yml
2024-07-29 23:00:08,240 - apscheduler.executors.default - ERROR - Job "RunJob.run_job (trigger: date[2024-07-29 23:00:00 PDT], next run at: 2024-07-29 23:00:00 PDT)" raised an exception
Traceback (most recent call last):
  File "apscheduler\executors\base.py", line 125, in run_job
  File "aind_watchdog_service\run_job.py", line 300, in run_job
  File "aind_watchdog_service\run_job.py", line 245, in move_manifest_to_archive
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\svc_aind_behavior\\Documents\\aind_watchdog_service\\manifest\\manifest_behavior_0_2024-07-29_11-05-55.yml'
2024-07-29 23:00:08,244 - root - INFO - Scheduling job to run at 2024-07-30 23:00:00 C:\Users\svc_aind_behavior\Documents\aind_watchdog_service\manifest\manifest_behavior_0_2024-07-29_11-05-55.yml
2024-07-29 23:00:08,246 - apscheduler.scheduler - INFO - Added job "RunJob.run_job" to job store "default"
2024-07-29 23:00:13,996 - root - INFO - Executing command: ['robocopy', 'C:/behavior_data/446-6-D/708026/behavior_708026_2024-07-29_09-23-46/behavior-videos', WindowsPath('//allen/aind/scratch/dynamic_foraging_rig_transfer/behavior_708026_2024-07-29_09-23-46/behavior-videos'), '/z', '/e', '/j', '/r:5']
2024-07-29 23:30:43,248 - root - INFO - Executing command: ['robocopy', 'C:/behavior_data/446-6-D/708026/behavior_708026_2024-07-29_09-23-46/fib', WindowsPath('//allen/aind/scratch/dynamic_foraging_rig_transfer/behavior_708026_2024-07-29_09-23-46/fib'), '/z', '/e', '/j', '/r:5']
2024-07-29 23:43:01,688 - root - INFO - Executing command: ['robocopy', 'C:\\behavior_data\\446-6-D\\708026\\behavior_708026_2024-07-29_09-23-46\\metadata-dir', '//allen/aind/scratch/dynamic_foraging_rig_transfer\\behavior_708026_2024-07-29_09-23-46', 'session.json', '/j', '/r:5']
2024-07-29 23:43:02,052 - root - INFO - Executing command: ['robocopy', 'C:\\behavior_data\\446-6-D\\708026\\behavior_708026_2024-07-29_09-23-46\\metadata-dir', '//allen/aind/scratch/dynamic_foraging_rig_transfer\\behavior_708026_2024-07-29_09-23-46', 'rig.json', '/j', '/r:5']
2024-07-29 23:43:02,752 - root - INFO - Job complete for C:\Users\svc_aind_behavior\Documents\aind_watchdog_service\manifest\manifest_behavior_708026_2024-07-29_09-23-46.yml
2024-07-29 23:43:02,752 - root - INFO - Executing command: ['robocopy', 'C:\\Users\\svc_aind_behavior\\Documents\\aind_watchdog_service\\manifest', 'C:\\Users\\svc_aind_behavior\\Documents\\aind_watchdog_service\\manifest\\manifest_complete', 'manifest_behavior_708026_2024-07-29_09-23-46.yml', '/j', '/r:5']
2024-07-29 23:43:02,781 - root - INFO - Found event file C:\Users\svc_aind_behavior\Documents\aind_watchdog_service\manifest\manifest_behavior_708026_2024-07-29_09-23-46.yml
2024-07-29 23:43:02,783 - apscheduler.executors.default - ERROR - Job "RunJob.run_job (trigger: date[2024-07-29 23:00:00 PDT], next run at: 2024-07-29 23:00:00 PDT)" raised an exception
Traceback (most recent call last):
  File "apscheduler\executors\base.py", line 125, in run_job
  File "aind_watchdog_service\run_job.py", line 300, in run_job
  File "aind_watchdog_service\run_job.py", line 245, in move_manifest_to_archive
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\svc_aind_behavior\\Documents\\aind_watchdog_service\\manifest\\manifest_behavior_708026_2024-07-29_09-23-46.yml'
2024-07-29 23:43:02,788 - root - INFO - Scheduling job to run at 2024-07-30 23:00:00 C:\Users\svc_aind_behavior\Documents\aind_watchdog_service\manifest\manifest_behavior_708026_2024-07-29_09-23-46.yml
2024-07-29 23:43:02,788 - apscheduler.scheduler - INFO - Added job "RunJob.run_job" to job store "default"

bruno-f-cruz commented 2 months ago

This looks like a problem with the application that wrote the file: PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\svc_aind_behavior\\Documents\\aind_watchdog_service\\manifest\\manifest_behavior_708026_2024-07-29_09-23-46.yml'

Is it possible that the file was not properly closed?
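For reference, one way a writer can rule this out on its side is to finish and close the file somewhere else, then move it into the watched folder in a single step. This is only a sketch under assumptions: `write_manifest` and the `manifest_staging` folder are hypothetical names, not part of aind-watchdog-service or of the application that writes the manifest.

```python
import os
import tempfile
from pathlib import Path


def write_manifest(manifest_dir: Path, name: str, text: str) -> Path:
    """Write a manifest so that it only appears in manifest_dir once closed.

    Hypothetical helper: the staging folder name is an assumption.
    """
    manifest_dir.mkdir(parents=True, exist_ok=True)
    # Stage next to (not inside) the watched folder, on the same volume,
    # so the watcher never sees a half-written file.
    staging = manifest_dir.parent / "manifest_staging"
    staging.mkdir(exist_ok=True)

    fd, tmp_path = tempfile.mkstemp(dir=staging, suffix=".tmp")
    # The context manager closes the handle before the rename below.
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(text)
        fh.flush()
        os.fsync(fh.fileno())

    final_path = manifest_dir / name
    # The file is already closed when it shows up in the watched folder.
    os.replace(tmp_path, final_path)
    return final_path
```

If the error still shows up with a writer like this, the open handle probably belongs to something other than the writing application.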

alexpiet commented 2 months ago

I think @arielleleon fixed this issue, since we triggered a few sessions yesterday.

arielleleon commented 2 months ago

@bruno-f-cruz - it should be closed. It's loaded with a context manager and stored in a variable. I think it's a Windows permission error that has to do with the permissions of the account executing the application.
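
If the lock turns out to be short-lived (for example, another process still holding the manifest for a moment after the copy), retrying the move is one possible workaround. A minimal sketch, assuming a hypothetical `move_with_retry` helper; this is not the actual `move_manifest_to_archive` code from `run_job.py`, and it would not help if the cause is the account's permissions.

```python
import shutil
import time
from pathlib import Path


def move_with_retry(src, dst_dir, attempts=5, delay_s=1.0, move=shutil.move):
    """Move src into dst_dir, retrying when Windows reports the file in use.

    Hypothetical helper; `move` is injectable only to make testing easier.
    """
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    dst = dst_dir / Path(src).name

    for attempt in range(1, attempts + 1):
        try:
            move(str(src), str(dst))
            return dst
        except PermissionError:
            # WinError 32 surfaces as PermissionError in Python.
            if attempt == attempts:
                raise  # give up and let the scheduler log the failure
            time.sleep(delay_s * attempt)  # simple linear backoff
```

A call would look like `move_with_retry(manifest_path, manifest_path.parent / "manifest_complete")`, where `manifest_path` is whatever path the job was started with.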